2023-09-28 10:51:51,220 INFO [train.py:1107] (0/4) Training started
2023-09-28 10:51:51,228 INFO [train.py:1117] (0/4) Device: cuda:0
2023-09-28 10:51:51,234 INFO [train.py:1129] (0/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '09ada8fb-dirty', 'icefall-git-date': 'Thu Sep 28 10:47:39 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-6-0423201309-7c68fd68fb-6cszs', 'IP address': '10.177.28.83'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 30, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_tal_csasr': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000}
2023-09-28 10:51:51,234 INFO [train.py:1131] (0/4) About to create model
2023-09-28 10:51:52,081 INFO [train.py:1135] (0/4) Number of model parameters: 68625511
2023-09-28 10:51:58,694 INFO [train.py:1150] (0/4) Using DDP
2023-09-28 10:51:59,292 INFO [multi_dataset.py:39] (0/4) About to get multidataset train cuts
2023-09-28 10:51:59,292 INFO [multi_dataset.py:42] (0/4) Loading Aishell-2 in lazy mode
2023-09-28 10:51:59,294 INFO [multi_dataset.py:49] (0/4) Loading TAL-CSASR in lazy mode
2023-09-28 10:51:59,296 INFO [multi_dataset.py:142] (0/4) About to get train-clean-100 cuts
2023-09-28 10:51:59,297 INFO [multi_dataset.py:149] (0/4) About to get train-clean-360 cuts
2023-09-28 10:51:59,299 INFO [multi_dataset.py:156] (0/4) About to get train-other-500 cuts
2023-09-28 10:52:14,751 INFO [asr_datamodule.py:218] (0/4) Enable MUSAN
2023-09-28 10:52:14,751 INFO [asr_datamodule.py:219] (0/4) About to get Musan cuts
2023-09-28 10:52:18,014 INFO [asr_datamodule.py:243] (0/4) Enable SpecAugment
2023-09-28 10:52:18,014 INFO [asr_datamodule.py:244] (0/4) Time warp factor: 80
2023-09-28 10:52:18,015 INFO [asr_datamodule.py:254] (0/4) Num frame mask: 10
2023-09-28 10:52:18,015 INFO [asr_datamodule.py:267] (0/4) About to create train dataset
2023-09-28 10:52:18,015 INFO [asr_datamodule.py:294] (0/4) Using DynamicBucketingSampler.
2023-09-28 10:52:18,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625
2023-09-28 10:52:18,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9
2023-09-28 10:52:18,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8
2023-09-28 10:52:18,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:18,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:18,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:18,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:18,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:18,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:18,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:18,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275
2023-09-28 10:52:19,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625
2023-09-28 10:52:19,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9
2023-09-28 10:52:20,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74
2023-09-28 10:52:20,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99
2023-09-28 10:52:20,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225
2023-09-28 10:52:20,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83
2023-09-28 10:52:20,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51
2023-09-28 10:52:20,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:21,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:21,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:21,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625
2023-09-28 10:52:21,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375
2023-09-28 10:52:21,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275
2023-09-28 10:52:21,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:21,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:22,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:22,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:22,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775
2023-09-28 10:52:22,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:22,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375
2023-09-28 10:52:23,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97
2023-09-28 10:52:23,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625
2023-09-28 10:52:24,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:24,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86
2023-09-28 10:52:24,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63
2023-09-28 10:52:24,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125
2023-09-28 10:52:24,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:24,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43
2023-09-28 10:52:24,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71
2023-09-28 10:52:24,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625
2023-09-28 10:52:25,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625
2023-09-28 10:52:25,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275
2023-09-28 10:52:25,654 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956
2023-09-28 10:52:25,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96
2023-09-28 10:52:25,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225
2023-09-28 10:52:25,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:26,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83
2023-09-28 10:52:26,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96
2023-09-28 10:52:26,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96
2023-09-28 10:52:26,474 INFO [asr_datamodule.py:309] (0/4) About to create train dataloader
2023-09-28 10:52:26,475 INFO [multi_dataset.py:88] (0/4) About to get multidataset dev cuts
2023-09-28 10:52:26,475 INFO [multi_dataset.py:91] (0/4) Loading Aishell-2 DEV set in lazy mode
2023-09-28 10:52:26,478 INFO [multi_dataset.py:163] (0/4) About to get dev-clean cuts
2023-09-28 10:52:26,479 INFO [multi_dataset.py:170] (0/4) About to get dev-other cuts
2023-09-28 10:52:26,519 INFO [asr_datamodule.py:340] (0/4) About to create dev dataset
2023-09-28 10:52:27,311 INFO [asr_datamodule.py:357] (0/4) About to create dev dataloader
2023-09-28 10:52:27,312 INFO [train.py:1351] (0/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM.
2023-09-28 10:52:27,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625
2023-09-28 10:52:27,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9
2023-09-28 10:52:27,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8
2023-09-28 10:52:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:27,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:27,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:27,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:28,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:28,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:28,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:28,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275
2023-09-28 10:52:28,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625
2023-09-28 10:52:28,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9
2023-09-28 10:52:28,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74
2023-09-28 10:52:28,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99
2023-09-28 10:52:29,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225
2023-09-28 10:52:29,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83
2023-09-28 10:52:29,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51
2023-09-28 10:52:29,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:29,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:30,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:30,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625
2023-09-28 10:52:30,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375
2023-09-28 10:52:31,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275
2023-09-28 10:52:31,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:31,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:31,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:31,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:31,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775
2023-09-28 10:52:31,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:31,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375
2023-09-28 10:52:32,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97
2023-09-28 10:52:32,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625
2023-09-28 10:52:32,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:32,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86
2023-09-28 10:52:33,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63
2023-09-28 10:52:33,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125
2023-09-28 10:52:33,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:33,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43
2023-09-28 10:52:33,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71
2023-09-28 10:52:33,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625
2023-09-28 10:52:34,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625
2023-09-28 10:52:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275
2023-09-28 10:52:34,583 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956
2023-09-28 10:52:35,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96
2023-09-28 10:52:35,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225
2023-09-28 10:52:35,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:35,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83
2023-09-28 10:52:35,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96
2023-09-28 10:52:35,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96
2023-09-28 10:52:36,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625
2023-09-28 10:52:36,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9
2023-09-28 10:52:36,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8
2023-09-28 10:52:36,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:36,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:36,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:36,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:36,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:36,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:36,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:36,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275
2023-09-28 10:52:37,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625
2023-09-28 10:52:37,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9
2023-09-28 10:52:37,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74
2023-09-28 10:52:37,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99
2023-09-28 10:52:37,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225
2023-09-28 10:52:37,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83
2023-09-28 10:52:38,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51
2023-09-28 10:52:38,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:38,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:38,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:38,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625
2023-09-28 10:52:39,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375
2023-09-28 10:52:39,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275
2023-09-28 10:52:39,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:39,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:39,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:39,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275
2023-09-28 10:52:39,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775
2023-09-28 10:52:39,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:39,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375
2023-09-28 10:52:41,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97
2023-09-28 10:52:41,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625
2023-09-28 10:52:41,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275
2023-09-28 10:52:41,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86
2023-09-28 10:52:41,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63
2023-09-28 10:52:41,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125
2023-09-28 10:52:42,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:42,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43
2023-09-28 10:52:42,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71
2023-09-28 10:52:42,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625
2023-09-28 10:52:42,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625
2023-09-28 10:52:43,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275
2023-09-28 10:52:43,355 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956
2023-09-28 10:52:43,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96
2023-09-28 10:52:43,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225
2023-09-28 10:52:43,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:43,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83
2023-09-28 10:52:43,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96
2023-09-28 10:52:43,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96
2023-09-28 10:52:44,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625
2023-09-28 10:52:45,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54
2023-09-28 10:52:46,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725
2023-09-28 10:52:46,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875
2023-09-28 10:52:46,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:46,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375
2023-09-28 10:52:47,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:47,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68
2023-09-28 10:52:47,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94
2023-09-28 10:52:47,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:47,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:48,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:48,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:48,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375
2023-09-28 10:52:48,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9
2023-09-28 10:52:48,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79
2023-09-28 10:52:48,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9
2023-09-28 10:52:49,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625
2023-09-28 10:52:49,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:50,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98
2023-09-28 10:52:50,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875
2023-09-28 10:52:50,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375
2023-09-28 10:52:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:52,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625
2023-09-28 10:52:52,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125
2023-09-28 10:52:53,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86
2023-09-28 10:52:53,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86
2023-09-28 10:52:53,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:53,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375
2023-09-28 10:52:53,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625
2023-09-28 10:52:53,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375
2023-09-28 10:52:54,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85
2023-09-28 10:52:54,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375
2023-09-28 10:52:54,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375
2023-09-28 10:52:54,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725
2023-09-28 10:52:55,391 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768
2023-09-28 10:52:55,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375
2023-09-28 10:52:56,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375
2023-09-28 10:52:56,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875
2023-09-28 10:52:56,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62
2023-09-28 10:52:56,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875
2023-09-28 10:52:56,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725
2023-09-28 10:52:56,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:56,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625
2023-09-28 10:52:57,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375
2023-09-28 10:52:57,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68
2023-09-28 10:52:57,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training.
Duration: 0.9818125 2023-09-28 10:52:59,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:52:59,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 10:52:59,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 10:52:59,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:52:59,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:59,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:00,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:00,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:53:00,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:53:00,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:01,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:01,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 10:53:01,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:53:01,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 10:53:01,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:53:01,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:01,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 10:53:01,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:02,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 10:53:03,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:03,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:03,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:03,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:03,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 10:53:03,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 10:53:03,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 10:53:04,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:04,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:04,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:04,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:04,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 10:53:04,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 10:53:04,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 10:53:04,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:04,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:05,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. 
Duration: 0.59 2023-09-28 10:53:05,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 10:53:05,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:05,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:06,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:06,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:06,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:06,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:07,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:07,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 10:53:07,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:08,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:53:08,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 10:53:08,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:08,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:08,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:08,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 10:53:08,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:53:08,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:08,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:08,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:09,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 10:53:09,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:09,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:09,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 10:53:09,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:53:10,349 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 10:53:10,387 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 10:53:10,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:10,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:11,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:53:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:11,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:12,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:12,319 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 10:53:13,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:53:13,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 10:53:13,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:14,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:14,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:14,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:15,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:15,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:15,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:15,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:15,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:15,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:15,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 10:53:15,840 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. 
Duration: 0.896 2023-09-28 10:53:15,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:16,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:16,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:16,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:16,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 10:53:16,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:16,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:53:16,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:16,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:16,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:16,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:16,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:53:17,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:17,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:17,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:17,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:17,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:17,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:18,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:18,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:18,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 10:53:18,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 10:53:18,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 10:53:19,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 10:53:19,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:19,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:20,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:20,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:20,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:20,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:20,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 10:53:20,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:21,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:22,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:53:22,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 10:53:22,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 10:53:22,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:22,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:22,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:23,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:23,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:23,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:23,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 10:53:24,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:24,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:24,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:24,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:24,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:53:24,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 10:53:25,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:53:25,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:25,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:25,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:25,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 10:53:25,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:25,936 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 10:53:26,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:27,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:27,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:27,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. 
Duration: 0.92 2023-09-28 10:53:27,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:27,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:28,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 10:53:28,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:53:28,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:28,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:29,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:29,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:29,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:31,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:31,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:31,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 10:53:31,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:31,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:53:31,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:32,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:32,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:32,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:32,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:32,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 10:53:32,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:32,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:33,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:34,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 10:53:35,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:35,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:53:36,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:36,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 10:53:36,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:36,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:36,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:36,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:37,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 10:53:37,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:37,255 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 10:53:37,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:53:37,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:53:37,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:37,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:38,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:38,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:38,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:38,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:40,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:41,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:41,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:53:42,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:53:42,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 10:53:42,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:53:42,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:42,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:42,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:53:42,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:42,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:43,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 10:53:43,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:43,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:43,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:43,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:43,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 10:53:43,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:43,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:53:44,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:44,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:53:44,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:44,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:53:45,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:45,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:46,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:46,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:46,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 10:53:47,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 10:53:47,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:47,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 10:53:47,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:47,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:47,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 10:53:48,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:48,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:49,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:49,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 10:53:49,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:49,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:53:49,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. 
Duration: 0.85 2023-09-28 10:53:49,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:50,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:53:50,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:53:50,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 10:53:51,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 10:53:51,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:51,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 10:53:51,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:52,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:52,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:53:52,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:52,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:53:53,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 10:53:53,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:53,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:53,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 10:53:53,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:54,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:53:55,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:55,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 10:53:55,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:55,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-28 10:53:56,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:56,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:56,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 10:53:56,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:53:56,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:56,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 10:53:57,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:57,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:57,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:57,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:57,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:58,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 10:53:58,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 10:53:58,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:59,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:59,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:59,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:00,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 10:54:00,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:00,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 10:54:00,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:00,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 10:54:00,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:01,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-09-28 10:54:01,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:02,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:02,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:02,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:02,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:02,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:02,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:02,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:02,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:02,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:03,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:03,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 10:54:03,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:04,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:04,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 10:54:04,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:05,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:05,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:05,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:05,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 10:54:05,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:06,054 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 10:54:06,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 10:54:06,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 10:54:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:06,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 10:54:06,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:07,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:54:07,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:07,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:07,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:07,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:08,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:09,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:54:09,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 10:54:09,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:54:09,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:09,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:09,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:10,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:10,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 10:54:10,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 10:54:10,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:10,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 10:54:10,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:11,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:11,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 10:54:11,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 10:54:11,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:11,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:11,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:11,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:12,076 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 10:54:12,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 10:54:12,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:12,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:13,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 10:54:13,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 10:54:13,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 10:54:13,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:14,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 10:54:15,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:54:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 10:54:16,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:16,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:16,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 10:54:16,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:17,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:54:17,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:17,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:17,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. 
Duration: 0.92 2023-09-28 10:54:18,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:54:18,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 10:54:18,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:54:18,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:18,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 10:54:19,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:19,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:19,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:19,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 10:54:19,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:19,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:19,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 10:54:19,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 10:54:19,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:54:20,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:54:20,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:54:21,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:21,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:21,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 10:54:21,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 10:54:22,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:23,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:23,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:24,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:54:24,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:24,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 10:54:24,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 10:54:24,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 10:54:24,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:25,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:25,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:54:25,447 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 10:54:25,477 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 10:54:25,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:25,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:25,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-09-28 10:54:26,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:26,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:26,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:54:26,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 10:54:26,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:27,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:54:27,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:54:27,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 10:54:27,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:54:28,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 10:54:28,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 10:54:28,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 10:54:28,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:29,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:29,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:29,418 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 10:54:29,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:30,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:54:30,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:30,805 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 10:54:30,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 10:54:31,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:31,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:54:31,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 10:54:32,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:54:32,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:32,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:32,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:33,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:54:34,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:54:34,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:34,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 10:54:34,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:54:34,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:34,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 10:54:34,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:34,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:34,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 10:54:35,175 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 10:54:35,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:35,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:35,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:35,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:35,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:54:36,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 10:54:36,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:54:36,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:54:37,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:38,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:38,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:39,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 10:54:39,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:39,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:39,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 10:54:39,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:40,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:40,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 10:54:40,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 10:54:40,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 10:54:40,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 10:54:41,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:41,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:41,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:41,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:41,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:41,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:54:41,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:42,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 10:54:42,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:54:42,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:42,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 10:54:42,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:43,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:43,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 10:54:43,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 10:54:43,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:45,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:45,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:45,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:45,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:45,879 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 10:54:46,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:46,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-28 10:54:46,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:46,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:46,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:54:46,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:46,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 10:54:47,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 10:54:47,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:47,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:47,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:47,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:47,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:47,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 10:54:48,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:48,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:48,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 10:54:48,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:54:48,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:48,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:49,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:49,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:54:49,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:54:50,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 10:54:50,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 10:54:50,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 10:54:50,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:54:50,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:52,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:52,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:54:52,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 10:54:52,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:54:53,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:53,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:53,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 10:54:53,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:54,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 10:54:54,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 10:54:54,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:55,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:54:55,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:55,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:55,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:56,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:54:57,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:57,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:57,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:58,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 10:54:59,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:59,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:54:59,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 10:55:00,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:55:00,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 10:55:00,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:55:00,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:01,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:55:01,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:55:01,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:55:01,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:55:02,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:02,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 10:55:02,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 10:55:03,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:55:03,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:03,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:04,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 10:55:04,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:04,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:05,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:05,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:55:06,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:06,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:06,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:06,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:55:06,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:55:06,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:55:06,898 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 10:55:06,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:06,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:07,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:07,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:07,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:55:07,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 10:55:07,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:55:08,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 10:55:08,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:08,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:08,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:55:08,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 10:55:08,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 10:55:08,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:08,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:08,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:08,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:09,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:09,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:09,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:55:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:10,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:10,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:55:10,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:11,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:11,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:11,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:11,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:12,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 10:55:12,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 10:55:13,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 10:55:13,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:55:13,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:13,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 10:55:14,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:14,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:14,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:15,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:55:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:15,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:15,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:55:15,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:16,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 10:55:16,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. 
Duration: 0.84 2023-09-28 10:55:17,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:55:17,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:17,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:55:17,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:17,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 10:55:18,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:18,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:18,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 10:55:19,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:19,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:20,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:20,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 10:55:21,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 10:55:21,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 10:55:21,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 10:55:21,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:22,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:22,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:22,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:22,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 10:55:22,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 10:55:23,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 10:55:23,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 10:55:23,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. 
Duration: 0.91 2023-09-28 10:55:23,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 10:55:23,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:23,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 10:55:23,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:23,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:23,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:24,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:24,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:55:24,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:24,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:24,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:55:24,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:55:25,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:25,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:25,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 10:55:25,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:55:25,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:26,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:26,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:55:26,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 10:55:26,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:27,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 10:55:27,328 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 10:55:27,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-09-28 10:55:27,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:27,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:27,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:55:28,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:28,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:28,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:28,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:29,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:29,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 10:55:29,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:29,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 10:55:29,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 10:55:29,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:30,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 10:55:30,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:30,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:30,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:55:30,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:31,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:31,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 10:55:31,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:31,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:32,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:32,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 10:55:32,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:32,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:55:34,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:34,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:34,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:34,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:34,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:34,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:35,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:35,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:35,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:35,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. 
Duration: 0.76 2023-09-28 10:55:36,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:36,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:36,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:36,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:36,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 10:55:36,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:36,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 10:55:36,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:37,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:37,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:37,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:37,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:55:37,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:38,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:38,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:38,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 10:55:38,868 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 10:55:38,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 10:55:38,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:55:38,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:39,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:39,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:39,811 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 10:55:39,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. 
Duration: 0.4 2023-09-28 10:55:40,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:55:40,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:55:41,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:41,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:42,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 10:55:42,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:42,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 10:55:43,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:43,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:43,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 10:55:43,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:55:43,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 10:55:43,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 10:55:44,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:44,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:44,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:44,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:44,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:44,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 10:55:44,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 10:55:44,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 10:55:45,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:45,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:45,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 10:55:45,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:45,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:46,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:46,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:46,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 10:55:46,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 10:55:47,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:48,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 10:55:48,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 10:55:48,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 10:55:49,003 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 10:55:49,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 10:55:49,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:49,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:55:49,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:49,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:49,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 10:55:49,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:49,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:50,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:50,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:55:50,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:50,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:55:50,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. 
Duration: 0.98 2023-09-28 10:55:51,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:51,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:51,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:55:51,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:51,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:51,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:52,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:52,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:55:52,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:52,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:55:53,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:55:53,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 10:55:53,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 10:55:53,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:53,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:54,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 10:55:55,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:55,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:55,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 10:55:56,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:56,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 10:55:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:55:56,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:56,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:55:56,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:56,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:57,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:57,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:58,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:58,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:58,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 10:55:59,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:59,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:59,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 10:55:59,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-09-28 10:56:00,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:56:00,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:00,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:56:02,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:02,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:02,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 10:56:02,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:02,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 10:56:02,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:02,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:03,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:03,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:56:03,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 10:56:03,772 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 10:56:03,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:56:04,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 10:56:04,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:04,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 10:56:05,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:05,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:05,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:05,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:56:06,223 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 10:56:06,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:56:06,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:06,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:06,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:06,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 10:56:07,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:56:07,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:07,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 10:56:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:08,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:08,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:08,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-09-28 10:56:09,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:56:09,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:09,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:10,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:10,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:10,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 10:56:10,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:56:10,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:56:11,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:11,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:11,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:56:11,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-09-28 10:56:11,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:12,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:12,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:12,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 10:56:12,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:12,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:12,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 10:56:12,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:13,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:13,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:13,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 10:56:13,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-09-28 10:56:14,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:15,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 10:56:15,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:16,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:16,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 10:56:16,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 10:56:16,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:16,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:17,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:17,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 10:56:17,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 10:56:18,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-09-28 10:56:18,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:18,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 10:56:18,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 10:56:18,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 10:56:18,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:18,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:20,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:20,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:20,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:20,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:20,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 10:56:20,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 10:56:20,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:56:20,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:20,642 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 10:56:21,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 10:56:21,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 10:56:22,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 10:56:22,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:23,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:23,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:56:23,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:23,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:23,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-09-28 10:56:23,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:56:23,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 10:56:23,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 10:56:24,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:24,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:24,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:24,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:24,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:25,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:25,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:25,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:56:25,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 10:56:26,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:26,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:56:26,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:56:26,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:27,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:27,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:56:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:27,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 10:56:27,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:27,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 10:56:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:27,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-28 10:56:27,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:56:29,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:29,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:29,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:29,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 10:56:29,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 10:56:29,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:30,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 10:56:30,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 10:56:30,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:31,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:56:31,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-09-28 10:56:31,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:31,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:32,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:56:32,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 10:56:32,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 10:56:32,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 10:56:32,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:32,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:56:33,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 10:56:33,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:33,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:33,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 10:56:34,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:34,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:34,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:34,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 10:56:34,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:34,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 10:56:34,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 10:56:35,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:36,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:36,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:37,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:56:37,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 10:56:37,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:37,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 10:56:37,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:38,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:56:38,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:38,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:38,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 10:56:38,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:56:39,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:39,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:39,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:39,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 10:56:40,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:40,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 10:56:40,871 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 10:56:40,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:41,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:41,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:56:41,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:41,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 10:56:41,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:56:41,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:56:41,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:41,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 10:56:41,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 10:56:43,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:43,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 10:56:43,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:44,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:56:44,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 10:56:44,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:56:44,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:45,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:45,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:45,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 10:56:45,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 10:56:45,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:45,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 10:56:45,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:45,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 10:56:46,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:46,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:46,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:46,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:47,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:47,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:47,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:47,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. 
Duration: 0.95 2023-09-28 10:56:47,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:47,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 10:56:48,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:48,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:48,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 10:56:49,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:50,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:50,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:50,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 10:56:50,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:50,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:50,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-09-28 10:56:51,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:51,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:52,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:53,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:53,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 10:56:53,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:53,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:54,224 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 10:56:54,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:55,262 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 10:56:55,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:55,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 10:56:55,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:56:56,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:57,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:57,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:57,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:57,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:57,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:58,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:58,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:58,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:58,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:58,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:56:58,942 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 10:56:59,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 10:56:59,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:00,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:57:00,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:00,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:00,576 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 10:57:00,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:01,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:57:01,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:01,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 10:57:01,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 10:57:02,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 10:57:02,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 10:57:02,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:03,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:03,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:03,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:57:04,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:04,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:04,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:04,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 10:57:04,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:04,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 10:57:04,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:57:04,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:05,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:05,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:57:05,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 10:57:06,440 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 10:57:06,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:06,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:07,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:07,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-09-28 10:57:08,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:08,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:08,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 10:57:08,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:08,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:09,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:57:09,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:09,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:57:09,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:09,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:57:10,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 10:57:10,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 10:57:10,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:11,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:11,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:11,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:11,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:57:12,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 10:57:12,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:57:12,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:12,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 10:57:12,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:12,923 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 10:57:12,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 10:57:12,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:13,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:13,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:13,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:14,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 10:57:14,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 10:57:14,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 10:57:14,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:14,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 10:57:14,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:15,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:15,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:57:15,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 10:57:15,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:57:15,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:57:15,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:57:15,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:16,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 10:57:16,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:16,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:16,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:57:17,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:18,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-09-28 10:57:18,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:18,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:57:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:19,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:19,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:57:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 10:57:20,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:20,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:57:20,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 10:57:20,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:57:21,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:21,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:57:21,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:21,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:21,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:57:22,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:22,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 10:57:22,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:22,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:23,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 10:57:23,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:57:23,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:23,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:23,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. 
Duration: 0.96 2023-09-28 10:57:23,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:23,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 10:57:24,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:25,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:25,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:25,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 10:57:25,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 10:57:25,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 10:57:26,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:26,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 10:57:27,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:27,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. 
Duration: 0.91 2023-09-28 10:57:28,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:28,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:28,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:29,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:29,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:29,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:29,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:30,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 10:57:30,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:30,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:30,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 10:57:30,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 10:57:30,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:30,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 10:57:31,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 10:57:32,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 10:57:32,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:32,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 10:57:33,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:34,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:34,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:34,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 10:57:35,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:35,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. 
Duration: 0.83 2023-09-28 10:57:35,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:35,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:35,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:36,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 10:57:36,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:57:37,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 10:57:37,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 10:57:38,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 10:57:38,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:39,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:39,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:40,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. 
Duration: 0.63 2023-09-28 10:57:40,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 10:57:41,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:41,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:41,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:42,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:57:42,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:42,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:57:43,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:43,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:44,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 10:57:44,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:44,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 10:57:44,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:44,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:44,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:57:44,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:45,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:45,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 10:57:45,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:46,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:46,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:57:47,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 10:57:48,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:57:48,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:57:48,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:57:48,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:48,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:49,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:57:49,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:49,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:49,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:50,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:50,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:57:50,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:50,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 10:57:50,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-28 10:57:50,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 10:57:50,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:51,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:51,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 10:57:51,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:51,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:57:51,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 10:57:51,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:51,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:57:51,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:52,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:53,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 10:57:53,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:53,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:53,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:53,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:53,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:53,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:53,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:54,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 10:57:54,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:55,169 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 10:57:55,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:55,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 10:57:55,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:55,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 10:57:56,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:56,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 10:57:56,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 10:57:56,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:57,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:57,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:57,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 10:57:57,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 10:57:57,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 10:57:58,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 10:57:58,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:58:00,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 10:58:00,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:58:00,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:58:00,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:00,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:00,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:58:00,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 10:58:01,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:58:01,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:58:01,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:01,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 10:58:01,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:01,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:02,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:02,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 10:58:02,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:58:02,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:02,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:03,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 10:58:03,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 10:58:03,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:03,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 10:58:04,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 10:58:04,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:04,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:04,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:04,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 10:58:04,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:04,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:05,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 10:58:05,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:05,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:58:05,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 10:58:07,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:58:07,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 10:58:07,890 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 10:58:07,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:07,985 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 10:58:08,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:08,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:08,459 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 10:58:08,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:58:08,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 10:58:09,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:09,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:09,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:09,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:58:09,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:09,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:10,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 10:58:10,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 10:58:10,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:10,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 10:58:10,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 10:58:10,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:10,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:10,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:10,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:11,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:58:11,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:11,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 10:58:11,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:11,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:58:11,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:58:12,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:58:12,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 10:58:12,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:12,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 10:58:12,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 10:58:12,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 10:58:12,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:58:12,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:13,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:14,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 10:58:14,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 10:58:15,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:15,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:15,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:58:15,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:15,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 10:58:16,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:58:16,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:17,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 10:58:17,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:17,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:17,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 10:58:17,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:17,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:17,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:17,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 10:58:17,889 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 10:58:18,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:18,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 10:58:19,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:19,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:58:19,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 10:58:19,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:20,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:20,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:20,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:20,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:21,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:21,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 10:58:21,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 10:58:21,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 10:58:22,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:22,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. 
Duration: 0.99 2023-09-28 10:58:22,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:22,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:23,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:23,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 10:58:24,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:24,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 10:58:24,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:24,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 10:58:24,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 10:58:25,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:25,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 10:58:26,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:58:26,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:26,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:26,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 10:58:26,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 10:58:27,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:27,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:27,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:28,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:58:28,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:58:28,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:58:28,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:29,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:58:29,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:29,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 10:58:29,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:29,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 10:58:30,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:30,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:30,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:30,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 10:58:30,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 10:58:30,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 10:58:31,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 10:58:31,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:58:31,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:31,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:31,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:58:31,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:32,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 10:58:32,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:32,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:32,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:32,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:58:32,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 10:58:32,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 10:58:33,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 10:58:33,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:58:35,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 10:58:35,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:35,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 10:58:36,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:36,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:36,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:36,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:36,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:36,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:37,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:37,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:58:37,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:37,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:37,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:37,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:37,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:38,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 10:58:38,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:38,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 10:58:38,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 10:58:38,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 10:58:38,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:38,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 10:58:38,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:38,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:38,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 10:58:39,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:39,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:39,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:40,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 10:58:40,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:40,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:58:40,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 10:58:40,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:40,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 10:58:40,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:41,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:41,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:41,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 10:58:42,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:58:43,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:43,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:44,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:58:44,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:44,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:44,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:44,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. 
Duration: 0.94 2023-09-28 10:58:45,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:45,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:45,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:58:45,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:58:45,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 10:58:45,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 10:58:45,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:46,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 10:58:46,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:47,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:47,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:47,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 10:58:47,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:58:48,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 10:58:48,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:48,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:48,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 10:58:48,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:48,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:48,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:48,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:49,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:49,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:49,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 10:58:49,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:58:49,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:50,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:50,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 10:58:50,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:50,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:50,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 10:58:51,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:52,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:52,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:58:52,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 10:58:52,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 10:58:52,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:52,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:53,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 10:58:53,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:53,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 10:58:53,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:54,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:54,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:58:54,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 10:58:55,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:55,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 10:58:56,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 10:58:56,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:56,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:57,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:57,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:57,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:57,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:57,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:58:58,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:58,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 10:58:58,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:58:58,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 10:58:58,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 10:58:59,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:59,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:59,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:58:59,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:58:59,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:00,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:00,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:00,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:59:01,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:01,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 10:59:01,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:01,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. 
Duration: 0.536375 2023-09-28 10:59:01,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:01,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:59:01,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:01,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:02,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:59:02,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:02,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:59:03,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:03,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 10:59:03,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:04,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:04,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 10:59:04,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:04,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:04,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:59:05,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 10:59:05,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:05,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:06,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 10:59:06,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 10:59:06,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 10:59:06,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:06,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:06,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 10:59:06,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:07,540 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 10:59:07,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:07,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:08,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 10:59:08,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 10:59:08,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:59:08,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:09,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:59:09,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 10:59:10,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:10,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. 
Duration: 0.5 2023-09-28 10:59:10,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:10,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:10,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:10,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 10:59:11,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:11,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:11,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 10:59:11,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:11,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:12,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:12,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:12,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 10:59:12,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:59:12,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:12,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:12,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:13,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:13,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:13,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 10:59:14,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 10:59:14,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 10:59:15,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:15,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 10:59:15,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-28 10:59:17,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:17,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 10:59:17,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:17,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:18,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 10:59:18,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:18,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:18,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:18,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:19,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:19,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:59:19,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:59:19,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:59:19,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:19,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:20,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:59:20,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 10:59:20,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:21,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:21,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:59:21,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 10:59:21,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 10:59:21,353 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 10:59:21,477 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-09-28 10:59:21,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:21,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:21,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:21,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:21,941 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 10:59:21,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:22,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:22,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:22,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:22,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:22,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 10:59:23,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 10:59:23,433 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 10:59:23,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:59:23,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:24,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:24,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:24,660 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 10:59:24,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 10:59:24,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:25,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:25,107 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 10:59:25,173 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 10:59:25,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 10:59:25,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 10:59:26,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 10:59:26,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 10:59:27,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 10:59:28,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 10:59:28,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:28,323 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 10:59:28,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 10:59:28,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 10:59:28,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 10:59:28,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:29,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 10:59:29,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 10:59:30,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:30,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 10:59:30,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:31,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 10:59:31,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:31,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:59:31,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:32,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:32,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:32,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:32,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 10:59:32,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 10:59:32,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:32,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:33,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:33,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:33,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:33,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:33,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:34,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:34,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:34,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:34,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 10:59:34,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-28 10:59:34,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:34,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:35,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:35,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:35,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:35,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:35,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:35,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:59:35,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:35,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:36,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:37,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 10:59:37,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:37,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:59:37,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 10:59:37,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:37,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:37,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:38,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:38,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:38,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:59:39,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:39,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:39,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-09-28 10:59:39,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:39,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:40,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:40,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:41,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:41,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:41,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:41,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:42,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:59:42,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:42,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 10:59:42,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 10:59:42,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:42,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 10:59:42,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:43,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:44,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:44,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:44,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:44,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:44,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 10:59:45,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:45,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:45,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. 
Duration: 0.73 2023-09-28 10:59:45,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:45,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:46,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:46,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 10:59:46,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:46,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:46,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:46,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 10:59:46,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:47,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 10:59:47,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:47,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 10:59:47,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:59:47,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:47,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:48,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:48,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 10:59:48,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 10:59:48,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:48,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:49,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:49,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:49,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:49,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 10:59:49,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:49,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:49,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:49,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:50,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:51,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:51,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 10:59:51,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:59:51,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:52,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:52,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:52,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:59:53,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:53,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:53,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:59:53,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:59:53,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:53,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:53,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:54,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:54,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:54,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:55,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 10:59:55,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:55,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 10:59:55,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:55,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:55,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:59:56,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:56,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:57,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 10:59:57,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:58,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 10:59:58,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:58,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 10:59:58,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:58,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:59,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:59,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:59,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:59,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:59,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:00,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:00:00,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:00:00,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:01,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:00:01,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 11:00:01,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 11:00:02,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:02,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:02,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:03,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 11:00:03,694 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 11:00:03,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:03,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:03,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:04,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:04,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 11:00:04,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. 
Duration: 0.96 2023-09-28 11:00:05,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:00:05,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:05,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:05,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:05,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:05,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 11:00:06,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:00:06,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 11:00:06,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 11:00:06,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:06,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:06,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. 
Duration: 0.78 2023-09-28 11:00:06,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:00:07,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 11:00:07,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:07,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:07,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:08,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:08,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 11:00:08,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:08,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:00:08,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 11:00:08,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:08,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. 
Duration: 0.84 2023-09-28 11:00:08,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 11:00:08,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 11:00:09,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:09,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:09,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:09,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:10,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:10,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:10,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 11:00:10,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:10,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:10,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:00:10,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 11:00:10,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 11:00:10,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 11:00:11,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:00:12,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:12,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 11:00:12,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:13,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:00:13,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:13,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:13,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:00:13,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 11:00:13,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:13,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:13,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:13,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:14,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 11:00:14,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 11:00:14,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:14,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:14,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:14,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:14,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:15,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-28 11:00:15,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:15,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:16,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:16,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:16,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:16,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:16,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:16,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:17,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:17,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 11:00:17,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:18,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 11:00:18,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:18,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:18,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:18,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:19,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:19,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:19,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:19,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 11:00:19,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:00:19,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:19,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:19,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 11:00:20,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:20,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:20,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:20,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:20,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 11:00:20,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:21,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:21,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:21,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:21,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:00:21,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:21,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:00:21,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 11:00:21,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 11:00:21,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:00:22,050 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 11:00:22,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:22,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:22,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 11:00:22,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:22,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 11:00:22,419 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 11:00:22,419 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 11:00:22,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 11:00:22,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:00:22,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:22,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:22,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:23,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:00:23,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:23,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:24,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:24,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 11:00:24,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:26,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:26,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:26,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:00:26,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:26,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:26,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:26,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 11:00:27,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 11:00:28,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:00:28,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 11:00:29,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:29,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:29,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:30,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. 
Duration: 0.94 2023-09-28 11:00:31,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:31,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:31,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:00:31,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:32,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:32,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:32,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:33,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 11:00:33,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:33,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 11:00:34,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:34,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 11:00:34,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:35,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:35,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:35,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:35,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:35,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:00:35,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:35,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:00:36,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:00:36,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:36,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:36,877 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. 
Duration: 0.896 2023-09-28 11:00:37,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:00:37,233 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 11:00:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:00:37,445 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 11:00:37,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:37,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:00:37,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:38,125 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 11:00:38,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:38,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:39,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:39,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 11:00:39,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:39,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:40,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:40,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 11:00:40,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:40,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:40,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 11:00:40,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:40,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:41,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:42,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:42,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-28 11:00:42,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:00:42,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 11:00:42,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:43,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:43,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:43,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:44,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:44,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:44,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:44,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:45,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:45,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 11:00:45,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:00:45,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:46,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:46,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:47,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:00:47,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 11:00:47,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:47,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:47,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 11:00:47,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:47,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:49,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:00:49,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:49,742 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 11:00:49,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:50,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:50,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:00:50,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:50,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:50,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 11:00:51,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:51,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:51,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:52,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 11:00:52,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:53,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:00:53,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:53,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:00:54,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:54,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:54,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:00:54,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:54,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 11:00:55,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:55,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:00:55,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:55,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:55,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:56,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:00:56,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:00:56,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 11:00:56,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:56,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:56,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 11:00:57,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:57,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:58,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:00:58,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:58,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:00:58,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:00:58,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:58,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:58,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 11:00:59,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:00:59,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 11:01:00,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 11:01:00,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:00,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:00,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:01:00,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:00,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:01,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 11:01:01,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:02,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 11:01:02,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:03,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:03,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:03,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:01:03,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 11:01:03,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:01:04,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:01:04,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:04,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:04,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:01:04,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 11:01:05,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:05,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:05,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:05,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 11:01:05,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:06,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 11:01:06,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:01:06,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-09-28 11:01:07,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 11:01:07,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:07,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:01:07,327 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 11:01:07,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 11:01:07,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 11:01:07,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:08,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:09,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:09,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:09,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 11:01:10,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-09-28 11:01:10,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:01:10,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:11,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 11:01:11,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:11,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:11,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 11:01:12,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:12,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 11:01:13,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:13,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 11:01:13,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:14,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:01:14,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:14,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 11:01:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:15,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:15,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:16,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:16,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:01:16,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:01:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:17,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:17,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:17,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 11:01:17,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:17,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:01:17,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 11:01:17,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 11:01:18,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:18,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:18,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 11:01:18,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 11:01:18,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 11:01:18,438 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 11:01:18,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 11:01:18,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:01:18,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:18,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:19,085 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 11:01:19,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:19,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:01:19,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:01:19,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:20,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:20,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:20,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 11:01:21,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:21,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:01:21,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:21,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:21,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:21,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 11:01:22,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:22,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:01:22,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:23,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:01:23,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:23,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:23,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:24,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. 
Duration: 0.98 2023-09-28 11:01:24,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:01:25,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:25,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:25,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:25,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:01:25,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:25,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:25,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 11:01:26,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:26,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:26,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:26,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:01:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:27,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 11:01:27,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:27,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:27,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 11:01:27,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:27,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:01:28,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:01:28,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:28,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:01:29,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 11:01:29,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 11:01:30,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:32,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:01:32,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:32,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:32,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 11:01:33,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:01:33,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:33,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:01:33,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:01:33,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 11:01:33,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:33,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:01:33,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 11:01:33,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:33,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 11:01:34,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:34,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:34,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:35,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:01:35,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 11:01:35,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:35,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:36,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:36,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:01:36,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:38,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:01:38,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 11:01:38,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:38,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:38,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:38,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:01:38,942 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 11:01:38,943 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 11:01:38,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 11:01:39,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:39,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. 
Duration: 0.97 2023-09-28 11:01:39,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 11:01:39,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:39,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 11:01:40,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 11:01:40,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:40,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:40,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:41,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:41,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 11:01:41,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:41,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 11:01:42,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 11:01:42,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:42,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:42,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:01:42,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:42,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:42,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:43,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:01:43,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 11:01:43,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:43,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:43,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 11:01:45,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 11:01:46,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:46,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:46,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:46,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:01:47,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:47,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:47,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:01:47,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:47,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:01:47,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:48,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:01:48,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:48,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:48,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 11:01:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:48,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:49,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:01:49,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:01:49,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:50,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:50,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:50,784 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 11:01:51,195 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. 
Duration: 0.896 2023-09-28 11:01:51,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:01:51,278 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 11:01:52,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 11:01:52,092 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 11:01:52,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:52,498 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 11:01:52,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 11:01:52,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 11:01:53,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:01:53,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 11:01:53,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 11:01:53,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:01:53,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-09-28 11:01:54,055 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 11:01:54,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 11:01:55,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:55,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:55,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:55,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 11:01:55,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:56,077 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 11:01:56,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:56,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:56,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 11:01:57,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:01:57,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:57,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 11:01:57,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:57,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:57,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:58,402 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 11:01:58,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:59,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:59,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:59,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:01:59,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 11:02:00,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:02:00,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:00,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:01,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 11:02:01,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:01,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:02,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 11:02:02,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:02,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:02:02,305 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 11:02:02,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:02,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:02,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 11:02:03,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:03,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:03,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 11:02:03,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:03,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:03,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 11:02:04,136 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 11:02:04,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:04,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 11:02:04,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:04,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 11:02:05,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:02:05,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:02:05,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:06,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:06,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 11:02:06,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 11:02:07,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:07,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 11:02:07,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:07,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:07,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:07,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:08,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:02:08,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:08,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:08,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:08,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:08,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:09,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:09,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:09,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:09,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:09,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:02:09,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:10,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 11:02:10,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:10,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 11:02:10,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:10,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:10,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:11,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:11,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:11,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:11,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:11,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 11:02:11,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:12,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. 
Duration: 0.3545625 2023-09-28 11:02:13,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:13,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:13,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:13,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:13,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:13,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:02:13,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:02:13,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 11:02:13,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:13,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:13,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:14,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:02:14,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:02:14,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 11:02:14,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:15,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:02:15,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:16,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:16,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:16,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:16,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:16,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:16,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:16,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 11:02:16,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:17,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:17,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:02:18,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:18,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:19,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:02:19,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:20,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:20,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:20,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:20,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:02:20,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:20,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:21,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:21,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:21,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:21,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 11:02:21,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:22,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:22,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 11:02:22,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 11:02:22,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:22,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:02:22,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:23,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:23,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:23,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:23,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:23,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:02:23,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:23,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:23,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 11:02:23,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:23,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:24,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. 
Duration: 0.46 2023-09-28 11:02:24,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:24,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:24,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:25,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:02:25,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:25,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:25,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:25,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:25,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:25,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:02:26,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 11:02:26,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:26,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:27,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:28,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:28,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:28,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:28,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:28,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:02:29,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:29,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:29,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 11:02:30,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:02:30,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 11:02:30,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:02:31,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:31,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 11:02:31,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:31,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:31,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 11:02:31,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:32,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:02:32,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:32,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:32,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. 
Duration: 0.83 2023-09-28 11:02:32,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:32,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:32,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:32,805 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 11:02:32,806 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 11:02:33,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:34,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:34,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:34,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:34,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 11:02:35,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:02:35,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-09-28 11:02:35,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:35,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:35,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:36,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:36,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:36,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:36,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:37,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:02:37,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:37,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:37,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:37,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:02:38,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 11:02:38,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:38,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:38,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:39,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:39,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:39,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:40,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:40,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:40,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:02:40,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-28 11:02:41,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:41,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:41,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 11:02:41,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:41,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:41,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:41,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 11:02:41,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:41,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:41,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:42,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 11:02:42,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 11:02:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:02:43,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:43,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:43,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:43,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:43,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:44,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:44,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:44,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:44,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 11:02:45,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 11:02:45,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:02:45,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 11:02:45,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:46,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 11:02:46,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 11:02:46,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:48,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:48,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:48,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:48,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:48,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:48,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:49,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 11:02:49,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 11:02:49,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:49,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:49,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:49,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:50,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:50,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:50,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:50,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:50,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:50,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:51,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:02:51,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:02:51,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:52,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 11:02:52,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 11:02:52,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:02:52,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:52,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 11:02:52,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:52,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:52,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:52,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:52,822 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. 
Duration: 0.896 2023-09-28 11:02:52,889 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 11:02:52,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:02:52,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:53,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:53,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:53,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:53,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 11:02:54,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:55,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 11:02:55,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 11:02:55,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:55,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 11:02:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:55,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:56,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:02:56,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:56,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:02:56,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 11:02:56,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:02:57,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:57,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 11:02:57,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 11:02:57,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:57,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. 
Duration: 0.48 2023-09-28 11:02:57,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:58,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:58,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:58,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:58,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:59,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:59,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:59,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 11:02:59,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 11:02:59,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:00,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:00,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. 
Duration: 0.79 2023-09-28 11:03:00,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:03:02,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:03,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:03:03,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:03:03,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 11:03:03,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:03,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 11:03:03,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:03,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:04,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:04,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 11:03:05,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:03:05,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:05,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:05,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:05,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 11:03:05,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 11:03:05,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:03:05,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:06,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:03:06,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:07,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:07,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:07,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:03:07,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:07,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:07,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:07,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:08,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 11:03:09,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 11:03:09,547 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 11:03:09,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:09,880 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 11:03:10,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 11:03:10,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:10,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 11:03:10,195 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 11:03:10,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:03:10,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 11:03:10,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:10,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:03:11,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:11,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:03:11,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:11,363 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 11:03:11,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:11,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 11:03:12,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:03:12,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:12,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 11:03:12,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:12,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 11:03:13,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:13,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:13,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:13,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:13,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:03:13,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:13,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:13,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 11:03:13,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:03:14,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:14,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:14,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:14,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 11:03:14,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:14,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:15,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:03:15,456 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 11:03:15,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 11:03:16,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:16,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 11:03:16,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 11:03:16,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:17,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:03:18,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:19,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 11:03:19,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:03:19,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:19,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:20,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:20,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:20,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 11:03:20,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-09-28 11:03:20,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:20,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:03:21,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:21,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:21,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:21,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:21,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:03:21,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:21,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:03:22,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:22,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 11:03:22,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 11:03:23,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:23,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:03:23,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:23,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:23,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:03:24,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 11:03:24,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:24,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 11:03:24,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:24,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 11:03:24,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:24,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-28 11:03:25,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 11:03:25,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 11:03:25,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:03:25,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:25,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:25,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:03:25,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:25,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:25,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 11:03:26,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:26,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:26,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 11:03:26,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:27,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 11:03:28,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 11:03:28,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 11:03:28,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:28,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:03:29,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:29,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:29,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:29,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:30,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:30,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:03:30,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:30,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:30,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:30,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:31,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:31,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 11:03:31,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:31,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:31,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:31,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:03:31,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:32,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 11:03:32,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:32,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:33,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:33,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:33,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:33,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:33,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:03:33,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:34,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 11:03:34,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:34,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:34,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. 
Duration: 0.98 2023-09-28 11:03:34,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:35,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:35,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:35,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:03:36,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 11:03:36,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 11:03:37,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 11:03:37,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:37,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:37,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:38,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:03:38,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:03:39,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 11:03:39,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:39,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:40,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:40,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:40,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:03:40,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:40,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 11:03:40,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:40,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:40,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 11:03:40,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:03:41,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:03:41,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 11:03:41,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 11:03:41,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:41,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:42,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:42,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:42,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:42,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:42,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:42,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:42,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:03:42,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:42,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:03:43,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:44,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 11:03:44,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:44,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 11:03:44,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:44,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:44,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 11:03:45,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 11:03:46,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:46,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:03:46,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:46,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:03:46,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 11:03:46,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:46,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:03:47,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 11:03:47,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:47,571 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 11:03:47,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 11:03:48,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:48,166 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 11:03:48,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. 
Duration: 0.47775 2023-09-28 11:03:48,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 11:03:48,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 11:03:48,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 11:03:48,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:48,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:48,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:48,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 11:03:49,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:49,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:49,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:49,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:03:50,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. 
Duration: 0.96 2023-09-28 11:03:50,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:03:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:51,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:51,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 11:03:51,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 11:03:51,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:51,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:03:51,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:51,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:51,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:03:52,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:03:52,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 11:03:52,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 11:03:52,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:03:52,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:52,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:52,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:52,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 11:03:52,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:53,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 11:03:53,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:53,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 11:03:53,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 11:03:53,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 11:03:53,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:53,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 11:03:54,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:03:54,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:54,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:54,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:54,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:55,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:55,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:55,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 11:03:56,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:56,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-09-28 11:03:56,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:57,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:57,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 11:03:58,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:58,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:59,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:00,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:01,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 11:04:01,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:01,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 11:04:01,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:04:02,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 11:04:02,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:02,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:03,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 11:04:03,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:04:03,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 11:04:03,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 11:04:04,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:04:05,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:05,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:04:05,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:05,725 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 11:04:05,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-28 11:04:06,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:06,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 11:04:06,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 11:04:06,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 11:04:06,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 11:04:07,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:07,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:07,358 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 11:04:07,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:07,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:07,587 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 11:04:08,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-28 11:04:08,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:09,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:09,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 11:04:09,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:09,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:09,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:09,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:09,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:04:10,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:10,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:04:10,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:10,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:04:10,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:10,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:10,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:11,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:11,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:11,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:12,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:12,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:12,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:12,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 11:04:12,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:12,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 11:04:13,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:13,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:04:14,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:14,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:15,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:15,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 11:04:15,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:15,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:04:15,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:15,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 11:04:15,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 11:04:15,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 11:04:15,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:15,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:15,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:04:16,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:16,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:16,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:16,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 11:04:16,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:17,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:04:17,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 11:04:17,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:17,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. 
Duration: 0.76 2023-09-28 11:04:17,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 11:04:17,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 11:04:17,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:18,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:18,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:19,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:19,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:04:19,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:04:19,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:19,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:20,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 11:04:20,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:04:20,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:20,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:20,989 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 11:04:21,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:21,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:21,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:04:21,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:21,417 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 11:04:21,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:21,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:04:22,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:22,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. 
Duration: 0.97725 2023-09-28 11:04:22,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:04:22,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:22,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:04:22,816 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 11:04:23,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 11:04:23,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:23,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 11:04:23,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:24,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:04:24,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:24,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:24,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:04:24,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:24,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:24,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:04:24,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:24,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:25,173 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 11:04:25,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 11:04:25,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:04:26,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:26,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:26,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:26,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:04:26,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:26,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:26,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:26,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:27,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:04:27,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 11:04:27,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:27,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:27,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:28,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:28,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:28,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:04:28,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:28,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:29,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:29,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:29,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:29,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:29,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:30,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:30,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 11:04:30,400 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 11:04:30,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:30,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-09-28 11:04:30,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 11:04:31,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:04:31,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:31,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:31,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 11:04:31,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:31,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:31,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:31,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:31,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:32,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:32,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:04:33,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:33,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:34,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:34,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:34,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:34,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:34,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:35,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 11:04:35,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:04:35,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 11:04:35,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:35,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. 
Duration: 0.62 2023-09-28 11:04:35,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:35,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:36,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:36,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 11:04:36,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:36,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:37,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:37,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:38,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 11:04:38,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:38,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:04:38,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:04:38,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 11:04:38,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:38,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 11:04:38,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:38,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:38,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:04:39,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:39,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 11:04:39,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 11:04:39,717 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 11:04:39,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:40,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:04:40,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:40,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:41,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:41,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:41,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 11:04:42,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:42,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:42,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:43,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:04:43,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:44,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 11:04:45,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:04:45,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:45,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 11:04:45,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:45,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:45,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:45,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:45,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:46,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:46,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:47,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:47,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 11:04:48,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 11:04:48,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 11:04:49,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 11:04:49,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:49,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:04:49,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 11:04:49,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:50,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:04:51,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:51,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:51,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:04:51,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:51,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:04:52,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 11:04:52,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 11:04:53,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:04:53,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:04:53,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:54,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 11:04:54,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:04:55,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:55,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:55,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:04:55,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:55,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. 
Duration: 0.85 2023-09-28 11:04:55,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:56,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:56,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:56,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 11:04:57,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:57,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:57,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:58,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:58,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:58,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:58,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:59,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:04:59,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:59,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:05:00,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 11:05:00,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:05:01,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:05:01,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:01,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 11:05:02,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:02,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:02,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:02,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:02,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 11:05:02,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:02,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:02,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 11:05:03,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:03,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:05:03,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:04,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:04,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 11:05:04,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:04,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:04,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:05:04,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:05:05,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:05,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 11:05:05,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 11:05:05,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 11:05:05,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:05,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:06,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:06,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:05:06,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:05:06,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:05:07,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:05:07,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 11:05:07,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 11:05:07,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:08,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:08,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:08,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:09,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 11:05:09,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:09,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:09,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 11:05:09,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 11:05:09,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:05:10,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:10,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:10,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:10,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:05:11,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:12,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:05:12,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:12,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:05:12,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:12,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:12,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 11:05:13,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:13,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:13,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:13,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:05:13,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:05:14,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:05:14,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:14,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:14,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:14,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:15,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 11:05:15,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:05:15,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:15,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:05:16,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:16,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:16,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:16,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 11:05:16,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:17,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 11:05:17,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:17,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:05:17,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:17,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. 
Duration: 0.56 2023-09-28 11:05:18,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:18,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:18,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 11:05:19,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:19,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:19,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 11:05:20,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 11:05:20,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:20,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:05:20,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:20,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:21,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 11:05:21,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:22,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:22,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:22,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:05:22,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:22,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 11:05:23,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:23,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:23,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:24,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:24,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:05:24,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:05:24,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 11:05:24,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:24,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:24,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:05:25,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:25,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:25,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:25,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 11:05:26,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:26,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:26,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 11:05:27,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 11:05:28,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:28,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:29,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:05:29,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:05:29,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 11:05:30,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 11:05:30,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 11:05:30,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:30,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:30,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 11:05:30,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:30,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 11:05:30,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:31,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 11:05:31,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 11:05:31,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:31,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 11:05:32,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 11:05:32,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:32,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 11:05:33,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 11:05:33,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:33,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:05:33,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 11:05:34,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:05:34,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:34,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 11:05:34,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:05:34,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:34,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 11:05:34,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:34,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:34,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:35,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:35,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 11:05:35,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. 
Duration: 0.87 2023-09-28 11:05:35,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:36,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 11:05:36,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:36,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:37,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:05:37,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:37,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:05:37,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:38,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:05:38,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:38,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:38,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:05:38,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:39,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:05:39,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:39,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:40,158 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 11:05:40,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:40,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:40,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:05:40,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:40,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:05:41,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:41,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-09-28 11:05:41,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:41,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:05:41,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:42,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:05:42,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:42,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 11:05:42,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:42,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:05:42,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:43,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:44,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:44,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:05:44,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:44,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:44,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:44,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:44,928 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 11:05:45,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:45,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:05:46,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:05:46,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 11:05:46,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:46,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:46,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-09-28 11:05:46,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:47,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:47,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:47,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:05:47,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:05:48,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:48,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 11:05:48,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:48,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 11:05:49,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:49,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:49,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:05:49,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 11:05:49,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:49,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:05:50,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:50,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:51,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:51,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 11:05:51,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 11:05:51,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:51,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:51,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:51,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 11:05:52,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:52,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:52,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:52,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 11:05:52,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:05:53,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:05:53,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 11:05:53,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:53,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:53,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:53,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:54,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:05:54,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:54,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:05:55,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:55,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:55,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 11:05:55,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:55,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:55,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 11:05:57,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 11:05:57,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:57,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 11:05:57,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:58,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:58,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:05:59,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 11:05:59,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:59,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:59,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:00,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:06:00,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:06:01,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:06:01,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:01,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 11:06:01,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:06:02,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:03,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:03,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:06:03,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 11:06:04,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:04,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:04,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:06:05,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:06:05,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:05,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:06:05,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:06:05,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:05,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:05,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 11:06:06,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:06:06,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:06,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:06,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:06:06,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:07,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:06:07,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:07,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:07,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:06:08,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:06:08,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 11:06:08,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:09,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:09,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:10,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 11:06:10,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 11:06:10,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:11,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:11,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:11,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 11:06:12,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. 
Duration: 0.99 2023-09-28 11:06:12,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 11:06:12,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:12,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:13,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:13,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:06:13,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:06:14,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 11:06:14,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:14,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:14,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:06:15,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:15,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 11:06:15,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 11:06:16,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:16,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:16,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:16,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:06:17,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:17,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:17,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:17,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:06:18,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:18,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:18,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:06:18,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:18,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 11:06:18,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 11:06:19,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:19,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:19,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:19,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:19,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 11:06:19,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 11:06:20,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:20,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 11:06:20,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 11:06:21,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:21,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:22,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:22,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 11:06:22,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 11:06:22,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:22,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:23,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:06:23,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:06:23,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:23,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:23,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 11:06:23,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 11:06:23,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:23,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 11:06:23,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:23,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:24,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:06:24,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:24,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:24,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:24,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:24,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:24,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-09-28 11:06:24,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:25,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:25,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:25,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:25,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:26,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:26,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:26,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:26,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 11:06:26,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:26,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 11:06:27,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:06:27,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 11:06:27,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 11:06:27,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:28,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:28,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:28,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:28,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:29,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:29,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:06:29,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:29,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:30,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:06:30,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:31,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:06:31,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:33,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:33,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:33,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 11:06:33,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 11:06:33,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:34,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 11:06:34,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:34,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 11:06:35,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:06:35,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 11:06:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:35,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:36,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:36,769 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 11:06:36,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:36,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 11:06:37,011 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 11:06:37,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:37,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:37,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:37,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:06:37,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 11:06:38,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:38,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:38,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:38,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:06:38,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:40,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:40,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:06:41,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 11:06:41,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 11:06:41,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 11:06:41,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 11:06:42,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:43,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:06:43,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:43,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:43,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:43,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 11:06:43,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:44,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:44,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 11:06:45,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:47,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:47,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:06:47,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:48,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:48,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 11:06:48,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:06:48,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 11:06:48,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:06:48,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 11:06:48,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:48,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:49,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:49,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:49,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:06:49,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:06:49,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:06:49,520 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 11:06:49,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:49,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:50,080 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 11:06:50,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:06:50,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:51,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 11:06:51,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:51,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:06:51,531 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. 
Duration: 0.768 2023-09-28 11:06:51,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:06:51,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 11:06:51,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:51,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:52,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:52,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:52,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:06:52,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:52,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 11:06:52,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:52,908 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 11:06:54,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 11:06:54,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:06:54,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:54,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:54,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:55,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:55,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:55,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:06:56,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 11:06:56,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:56,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:56,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:56,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-09-28 11:06:56,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:57,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:57,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:57,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:06:57,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:06:58,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:58,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:58,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:06:59,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:06:59,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 11:06:59,388 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 11:06:59,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 11:07:01,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 11:07:01,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:01,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:02,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:02,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:02,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:02,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:02,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 11:07:02,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:03,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:03,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 11:07:03,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:07:04,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 11:07:04,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:05,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 11:07:05,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 11:07:05,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:05,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:05,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:05,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:06,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 11:07:06,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 11:07:07,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. 
Duration: 0.94 2023-09-28 11:07:07,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 11:07:07,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:08,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:08,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:08,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:08,239 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 11:07:08,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:08,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:08,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:08,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:07:09,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:07:09,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:07:09,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:09,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 11:07:09,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:09,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:09,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:09,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:09,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 11:07:10,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:10,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 11:07:10,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:10,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:10,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. 
Duration: 0.79 2023-09-28 11:07:11,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:11,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:07:11,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:07:11,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 11:07:11,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:07:11,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:07:12,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 11:07:12,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:12,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:12,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:13,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:13,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 11:07:13,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:15,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:15,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:15,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:16,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:16,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:16,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:16,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:16,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:17,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 11:07:17,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:17,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. 
Duration: 0.49 2023-09-28 11:07:17,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 11:07:17,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 11:07:17,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:18,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:18,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:18,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:18,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:19,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:07:19,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:07:19,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:19,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:07:20,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:07:20,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:20,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 11:07:21,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 11:07:21,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:21,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 11:07:21,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:21,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:22,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:22,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:22,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 11:07:23,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:23,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 11:07:23,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 11:07:23,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:23,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 11:07:23,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:24,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:24,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:24,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 11:07:24,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:24,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:07:24,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:07:24,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 11:07:25,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:07:25,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:07:25,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:07:25,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 11:07:25,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:25,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:07:25,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:25,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:25,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 11:07:25,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:26,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:26,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 11:07:26,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 11:07:27,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:27,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:07:27,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:27,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:28,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 11:07:28,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 11:07:28,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:29,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:29,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:29,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:07:29,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:30,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:07:30,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 11:07:30,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:30,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:30,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:30,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:30,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:07:30,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 11:07:30,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:31,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:07:31,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:31,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:32,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:07:32,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:32,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 11:07:32,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:32,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:33,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:07:33,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:34,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:35,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 11:07:35,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:36,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:36,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:36,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. 
Duration: 0.94 2023-09-28 11:07:37,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:07:37,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:37,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:07:37,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:38,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:07:38,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 11:07:38,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:39,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:39,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:40,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:40,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:40,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 11:07:40,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:40,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:40,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:40,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:41,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:41,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:42,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 11:07:42,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 11:07:42,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:42,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:42,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:42,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 11:07:42,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:42,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:43,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:07:43,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:44,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:44,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:44,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 11:07:44,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:07:44,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 11:07:44,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:44,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:07:45,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:07:45,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:45,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 11:07:45,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:07:45,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:07:46,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:46,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:46,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:46,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:47,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:47,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:47,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:47,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-09-28 11:07:47,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:48,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:49,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:49,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:50,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:50,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 11:07:50,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:07:50,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:07:50,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:50,988 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 11:07:51,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:07:51,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 11:07:51,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 11:07:52,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:07:52,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 11:07:52,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:07:52,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:52,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:52,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:53,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:07:53,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:53,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:53,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 11:07:53,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. 
Duration: 0.88 2023-09-28 11:07:53,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:54,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:54,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:07:54,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:54,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:54,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 11:07:54,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 11:07:54,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 11:07:54,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:54,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 11:07:54,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 11:07:55,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:07:55,390 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 11:07:56,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:56,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:56,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:56,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 11:07:56,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:56,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:56,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:56,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:57,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:07:57,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:57,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:07:57,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:57,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:58,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 11:07:58,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:07:59,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:59,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:59,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:07:59,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:00,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:08:01,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:01,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:08:01,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:08:01,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:02,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:02,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:03,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 11:08:03,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:03,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:04,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:04,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 11:08:05,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:05,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:05,915 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 11:08:06,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:08:06,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:08:06,326 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 11:08:06,421 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 11:08:06,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:06,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:06,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:06,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:06,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:06,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:07,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 11:08:07,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:07,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:08:07,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:07,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 11:08:07,528 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 11:08:07,535 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 11:08:07,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 11:08:07,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:07,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:08,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:08,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:08,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 11:08:08,692 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 11:08:08,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:08:09,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:09,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:09,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:10,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 11:08:10,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 11:08:10,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 11:08:10,558 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 11:08:10,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:08:10,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:11,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 11:08:11,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:11,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:08:11,412 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 11:08:11,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:11,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 11:08:11,876 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 11:08:12,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 11:08:12,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 11:08:12,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 11:08:12,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:12,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:12,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:12,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. 
Duration: 0.71 2023-09-28 11:08:13,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 11:08:13,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:13,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:08:13,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:13,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:13,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:13,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 11:08:13,807 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 11:08:14,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:15,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:15,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 11:08:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 11:08:16,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:16,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:08:17,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 11:08:17,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:08:17,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:17,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:08:17,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:17,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 11:08:18,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 11:08:18,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 11:08:18,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:18,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-09-28 11:08:18,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:08:18,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:19,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 11:08:19,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:19,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:19,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:08:20,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:20,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:21,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:21,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:21,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:21,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:08:22,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 11:08:22,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:22,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:22,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:22,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:22,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:08:23,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:23,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:23,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:24,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:24,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:24,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-28 11:08:24,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 11:08:25,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 11:08:25,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:08:25,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:25,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 11:08:25,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:26,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:26,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 11:08:26,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:26,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:26,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:26,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:08:26,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:27,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:08:27,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 11:08:27,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:08:27,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:27,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:28,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:28,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:08:28,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:08:28,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 11:08:29,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:29,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 11:08:29,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 11:08:29,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 11:08:30,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:31,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:31,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:31,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:08:31,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:31,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:31,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:33,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:33,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:33,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:08:33,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:08:33,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:34,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:35,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:08:35,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:35,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:36,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 11:08:36,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:36,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:36,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:36,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:36,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:08:36,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 11:08:36,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:36,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:38,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:08:38,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:38,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:38,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:38,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:38,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 11:08:39,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 11:08:39,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 11:08:39,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. 
Duration: 0.95 2023-09-28 11:08:39,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 11:08:40,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:40,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:40,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:40,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:41,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:41,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:42,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:42,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:42,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:42,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:42,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 11:08:42,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:43,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 11:08:43,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 11:08:43,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:08:43,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 11:08:44,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 11:08:44,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:45,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 11:08:45,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:45,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:45,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:45,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 11:08:46,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 11:08:46,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:46,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:46,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:47,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:47,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 11:08:47,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 11:08:47,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:47,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 11:08:47,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 11:08:47,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:47,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:08:47,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:48,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:48,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:08:48,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:08:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 11:08:48,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:48,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:08:48,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 11:08:48,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:48,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 11:08:49,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:08:49,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:08:49,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:49,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:49,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:08:49,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:50,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:08:50,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:50,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:51,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:08:51,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:08:51,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:52,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 11:08:52,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 11:08:52,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:08:52,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:53,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:54,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 11:08:54,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:54,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:54,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:54,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:55,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 11:08:55,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:55,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:56,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:08:56,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:56,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:08:57,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 11:08:57,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:08:57,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:58,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:58,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:58,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:08:59,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:59,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 11:08:59,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:59,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:08:59,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:59,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:59,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:59,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 11:08:59,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 11:08:59,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 11:08:59,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:00,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:00,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:00,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:01,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:02,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:09:02,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:02,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:02,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:02,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:02,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:02,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 11:09:03,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:03,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 11:09:03,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:03,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 11:09:03,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:03,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. 
Duration: 0.99 2023-09-28 11:09:03,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 11:09:03,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:03,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:09:04,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:09:04,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:04,575 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 11:09:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:05,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 11:09:06,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:06,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:06,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 11:09:06,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:09:06,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:09:07,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:07,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:08,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:08,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:08,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:09:08,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 11:09:08,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 11:09:09,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:09:09,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 11:09:09,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:10,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 11:09:10,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:10,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 11:09:10,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:10,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:10,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:10,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:09:10,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:09:10,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:11,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:11,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:11,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:11,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-28 11:09:11,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:09:11,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:09:11,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 11:09:13,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:13,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:13,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:13,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:14,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:09:14,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 11:09:14,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 11:09:15,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:15,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 11:09:15,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:15,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:16,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:16,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:09:17,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:17,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 11:09:17,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:09:18,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:18,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 11:09:18,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:20,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:20,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. 
Duration: 0.76 2023-09-28 11:09:20,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:20,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:20,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:20,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:20,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 11:09:20,960 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 11:09:21,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:21,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:21,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:21,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 11:09:21,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:22,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. 
Duration: 0.95 2023-09-28 11:09:22,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:22,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:23,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:23,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:23,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 11:09:23,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:23,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 11:09:24,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:24,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:24,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:24,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 11:09:25,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 11:09:25,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 11:09:25,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:26,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:26,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:26,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:27,028 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 11:09:27,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 11:09:27,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 11:09:28,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:28,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:28,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:28,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. 
Duration: 0.512 2023-09-28 11:09:28,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:29,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:29,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:29,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 11:09:29,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 11:09:29,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:29,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:29,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:30,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:09:30,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 11:09:30,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 11:09:30,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:09:31,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:31,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 11:09:31,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:31,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:31,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:09:31,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:31,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:32,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:32,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 11:09:32,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 11:09:32,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 11:09:32,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 11:09:32,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:33,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 11:09:34,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:34,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:35,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:09:35,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:35,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:35,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 11:09:35,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:35,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:35,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:09:36,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:09:36,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:36,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:36,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:36,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:09:36,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:36,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:37,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:37,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:37,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:37,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:38,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-28 11:09:38,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:38,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 11:09:38,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 11:09:38,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:38,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:38,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:38,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:38,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:39,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:39,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:39,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:39,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:09:40,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:41,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 11:09:41,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:41,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:41,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:41,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:42,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:42,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:09:42,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:42,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:42,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:42,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 11:09:42,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:42,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:42,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:43,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 11:09:43,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 11:09:43,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:43,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:43,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:43,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:43,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:44,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 11:09:45,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:09:45,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:45,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:09:45,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:46,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:46,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:09:46,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:09:46,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 11:09:47,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:47,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:09:47,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:09:47,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:47,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-09-28 11:09:48,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:48,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:48,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:48,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 11:09:49,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 11:09:49,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:09:50,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:50,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 11:09:50,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:50,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:09:50,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:09:50,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. 
Duration: 0.99 2023-09-28 11:09:50,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:50,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 11:09:50,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:51,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:51,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 11:09:52,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:52,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:09:53,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:53,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 11:09:53,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:54,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:55,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:09:55,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:09:55,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 11:09:56,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 11:09:56,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 11:09:56,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 11:09:56,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:56,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:56,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:56,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:56,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:57,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 11:09:57,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-09-28 11:09:57,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:09:57,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:09:57,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:58,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:58,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:58,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:58,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 11:09:58,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:09:58,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:58,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 11:09:58,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 11:09:59,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. 
Duration: 0.93 2023-09-28 11:09:59,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:59,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:09:59,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:59,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:59,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:09:59,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:09:59,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 11:10:00,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:00,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:10:00,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:01,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 11:10:01,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 11:10:01,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:10:02,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 11:10:02,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:03,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:03,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 11:10:03,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:10:03,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:03,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:03,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:03,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:04,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:04,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. 
Duration: 0.42725 2023-09-28 11:10:04,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:10:04,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:04,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:10:05,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 11:10:05,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 11:10:05,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:10:05,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 11:10:05,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:05,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:05,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:10:05,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:05,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:10:05,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:05,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:06,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:06,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:06,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:06,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:10:07,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:07,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:07,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:08,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:08,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. 
Duration: 0.88 2023-09-28 11:10:09,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 11:10:09,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:09,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:10:09,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:09,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:09,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:09,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:10:10,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:10,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:10:10,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:10,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:11,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:10:11,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 11:10:11,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:11,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:10:11,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:12,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:12,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:12,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:12,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:12,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:12,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:12,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:10:12,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:10:12,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 11:10:13,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:14,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 11:10:14,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:10:15,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:15,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:15,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 11:10:16,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 11:10:16,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:16,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 11:10:16,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:16,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 11:10:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 11:10:16,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:16,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:17,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 11:10:17,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:17,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:17,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 11:10:17,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 11:10:17,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:17,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 11:10:17,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:17,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:10:17,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:17,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 11:10:17,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 11:10:18,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 11:10:18,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:18,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:18,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 11:10:18,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:18,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:18,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:18,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:10:19,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. 
Duration: 0.79 2023-09-28 11:10:19,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:19,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:19,963 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 11:10:20,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:20,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:20,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:20,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 11:10:21,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:21,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:21,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:21,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:10:21,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 11:10:21,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:21,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:23,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 11:10:23,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:24,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:25,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:25,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:25,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:25,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:10:25,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:25,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:25,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:10:25,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 11:10:26,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:26,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:26,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:26,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 11:10:26,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:26,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:27,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:27,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:10:27,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:28,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 11:10:28,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 11:10:28,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:28,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 11:10:28,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:29,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:29,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:29,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:29,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 11:10:29,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 11:10:30,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:30,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:30,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:30,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. 
Duration: 0.87 2023-09-28 11:10:31,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:31,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 11:10:31,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:31,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:31,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:31,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:32,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:32,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:32,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:32,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:32,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:32,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-09-28 11:10:32,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:33,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:33,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:33,529 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 11:10:33,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:33,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:33,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:33,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 11:10:34,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:34,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 11:10:34,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:34,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:10:34,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:34,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 11:10:35,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 11:10:35,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:35,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:35,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:35,720 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 11:10:36,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:36,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 11:10:37,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 11:10:37,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:37,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 11:10:37,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:37,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 11:10:37,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 11:10:38,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:38,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:38,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:39,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 11:10:39,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:39,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:39,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:10:39,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:40,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:10:40,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 11:10:40,385 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 11:10:40,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:40,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 11:10:40,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 11:10:41,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:42,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:42,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 11:10:42,381 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 11:10:42,404 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 11:10:42,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 11:10:42,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:43,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. 
Duration: 0.89 2023-09-28 11:10:44,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 11:10:44,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:10:44,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:44,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 11:10:45,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:10:45,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 11:10:45,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:45,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:45,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:45,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:45,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:10:46,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:10:46,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 11:10:46,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 11:10:46,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 11:10:46,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:46,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 11:10:46,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:46,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:10:46,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:47,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:47,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:10:47,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 11:10:47,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:10:47,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:48,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:10:48,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:48,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:48,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:10:48,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:48,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 11:10:48,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:10:48,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:10:48,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:49,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 11:10:49,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 11:10:50,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:10:50,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 11:10:51,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:52,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:52,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:52,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:52,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:53,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 11:10:53,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:53,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:53,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:10:54,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:10:54,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:54,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 11:10:55,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:55,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:10:55,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:55,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:10:55,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:55,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:55,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:55,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:56,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:56,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 11:10:56,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:57,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 11:10:57,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:57,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:58,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:58,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:58,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:58,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 11:10:58,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:10:58,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:59,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 11:10:59,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 11:10:59,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:10:59,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 11:10:59,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 11:11:00,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 11:11:00,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:00,311 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 11:11:00,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:11:00,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:00,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:11:00,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 11:11:00,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:01,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:11:01,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 11:11:01,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 11:11:01,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 11:11:02,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 11:11:02,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:11:02,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:11:02,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:03,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 11:11:03,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:11:03,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:11:03,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:03,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 11:11:03,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:11:03,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:11:04,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:05,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:05,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:05,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:05,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 11:11:05,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:11:05,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:11:06,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:06,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:11:06,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 11:11:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:06,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:07,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:11:07,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:11:07,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:08,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:11:08,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:11:08,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:11:08,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 11:11:08,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:08,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:11:28,541 INFO [train.py:1379] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:11:31,715 INFO [train.py:1379] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:11:36,034 INFO [train.py:1379] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:11:39,572 INFO [train.py:1379] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:11:51,707 INFO [train.py:1379] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:11:57,920 INFO [scaling.py:1022] (0/4) Whitening: name=None, num_groups=4, num_channels=128, metric=10.28 vs. limit=3.0 2023-09-28 11:11:58,982 INFO [train.py:1379] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:12:16,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:12:16,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 11:12:16,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 11:12:16,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:12:17,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:12:17,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:17,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:12:18,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:12:18,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 11:12:18,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 11:12:18,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 11:12:19,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:12:19,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 11:12:19,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 11:12:19,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:12:19,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:19,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:20,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:12:20,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:20,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:20,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:20,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:20,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:12:20,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:12:20,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:12:22,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 11:12:22,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:22,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:22,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 11:12:22,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 11:12:23,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:12:23,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:23,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 11:12:23,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 11:12:23,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:24,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 11:12:24,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:24,591 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 11:12:24,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 11:12:24,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:12:24,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:25,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 11:12:25,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 11:12:25,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 11:12:25,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:30,496 INFO [train.py:1039] (0/4) Epoch 1, batch 0, loss[loss=9.345, simple_loss=8.487, pruned_loss=8.565, over 24463.00 frames. ], tot_loss[loss=9.345, simple_loss=8.487, pruned_loss=8.565, over 24463.00 frames. ], batch size: 58, lr: 2.25e-02, grad_scale: 1.0 2023-09-28 11:12:30,497 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 11:12:44,941 INFO [train.py:1071] (0/4) Epoch 1, validation: loss=9.318, simple_loss=8.466, pruned_loss=8.496, over 1125622.00 frames. 
2023-09-28 11:12:44,942 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 19147MB 2023-09-28 11:12:48,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=0.0, ans=0.2 2023-09-28 11:12:49,204 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.55 vs. limit=7.5 2023-09-28 11:12:50,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 11:12:50,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:12:51,094 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.37 vs. limit=7.5 2023-09-28 11:12:52,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:12:55,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=0.0, ans=0.75 2023-09-28 11:12:58,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:58,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:12:58,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=0.0, ans=0.5 2023-09-28 11:13:01,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:13:01,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 11:13:02,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 11:13:06,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:06,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:10,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:10,797 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=121.20 vs. limit=5.033333333333333 2023-09-28 11:13:11,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:11,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:13:11,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:13,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 11:13:16,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:26,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 11:13:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:28,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 11:13:33,025 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.49 vs. limit=3.02 2023-09-28 11:13:33,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:13:33,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:13:34,553 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=372.39 vs. limit=7.6 2023-09-28 11:13:36,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:43,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:13:47,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:54,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 11:13:57,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. 
Duration: 0.86 2023-09-28 11:13:57,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:13:57,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:13:59,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:13:59,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:02,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 11:14:04,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,883 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=75.31 vs. limit=7.7 2023-09-28 11:14:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:14:08,272 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=181.80 vs. limit=7.7 2023-09-28 11:14:10,204 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=8.77 vs. 
limit=4.1066666666666665 2023-09-28 11:14:10,237 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=428.45 vs. limit=7.6 2023-09-28 11:14:11,244 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 11:14:14,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:14:15,952 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=181.30 vs. limit=4.066666666666666 2023-09-28 11:14:16,973 INFO [train.py:1039] (0/4) Epoch 1, batch 50, loss[loss=1.324, simple_loss=1.188, pruned_loss=1.235, over 23706.00 frames. ], tot_loss[loss=3.826, simple_loss=3.522, pruned_loss=2.981, over 1072266.74 frames. ], batch size: 232, lr: 2.48e-02, grad_scale: 0.25 2023-09-28 11:14:19,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:19,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=333.3333333333333, ans=0.484375 2023-09-28 11:14:21,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:14:22,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 11:14:22,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:14:22,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:14:26,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:26,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:29,208 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=130.35 vs. limit=7.625 2023-09-28 11:14:31,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:34,309 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=32.79 vs. limit=7.65 2023-09-28 11:14:35,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 11:14:35,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:36,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=185.37 vs. limit=7.8 2023-09-28 11:14:36,847 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=91.58 vs. limit=7.65 2023-09-28 11:14:38,733 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.47 vs. limit=7.8 2023-09-28 11:14:44,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. 
Duration: 0.87775 2023-09-28 11:14:44,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 11:14:46,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 11:14:49,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:14:51,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:14:51,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:51,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:53,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:14:53,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:14:53,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:15:00,655 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=16.87 vs. limit=4.1866666666666665 2023-09-28 11:15:01,810 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=85.21 vs. 
limit=7.675 2023-09-28 11:15:02,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:04,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:04,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:15:04,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 11:15:04,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=466.6666666666667, ans=0.1825 2023-09-28 11:15:06,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:15:08,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:15:08,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 11:15:09,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:10,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 11:15:10,987 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=18.87 vs. limit=7.7 2023-09-28 11:15:19,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:15:19,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:21,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:21,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=533.3333333333334, ans=0.475 2023-09-28 11:15:23,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:15:23,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:25,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 11:15:25,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 11:15:27,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:27,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:32,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:15:32,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:15:33,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=19.31 vs. limit=7.725 2023-09-28 11:15:34,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 11:15:34,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 11:15:36,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:15:36,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:38,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:15:40,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 11:15:40,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 11:15:40,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:42,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:42,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. 
Duration: 0.62225 2023-09-28 11:15:43,123 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=11.30 vs. limit=5.15 2023-09-28 11:15:44,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:15:44,426 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=600.0, ans=0.08650000000000001 2023-09-28 11:15:47,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:15:49,918 INFO [train.py:1039] (0/4) Epoch 1, batch 100, loss[loss=1.16, simple_loss=1.007, pruned_loss=1.227, over 23568.00 frames. ], tot_loss[loss=2.419, simple_loss=2.197, pruned_loss=2.061, over 1872512.54 frames. ], batch size: 256, lr: 2.70e-02, grad_scale: 0.5 2023-09-28 11:15:51,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:15:52,247 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=136.65 vs. limit=7.75 2023-09-28 11:15:55,762 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 2.173e+02 3.855e+02 5.319e+03 2.503e+05, threshold=7.710e+02, percent-clipped=0.0 2023-09-28 11:15:55,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:15:57,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. 
Duration: 0.83 2023-09-28 11:15:58,482 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=17.24 vs. limit=7.75 2023-09-28 11:15:59,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:16:02,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:16:03,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=12.70 vs. limit=4.266666666666667 2023-09-28 11:16:04,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:04,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:16:04,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:16:04,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:06,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 11:16:08,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:16:08,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:16:08,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:08,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:16:09,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=112.81 vs. limit=7.775 2023-09-28 11:16:14,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 11:16:14,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=733.3333333333334, ans=0.465625 2023-09-28 11:16:16,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:18,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:18,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:16:18,450 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=733.3333333333334, ans=0.465625 2023-09-28 11:16:20,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:16:24,318 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=330.27 vs. 
limit=7.775 2023-09-28 11:16:25,344 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 11:16:25,381 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 11:16:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:16:27,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:16:30,586 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.87 vs. limit=7.8 2023-09-28 11:16:31,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:16:35,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:38,245 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.86 vs. limit=8.1 2023-09-28 11:16:39,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,258 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 11:16:49,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-09-28 11:16:54,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:16:54,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:16:55,641 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=38.50 vs. limit=7.825 2023-09-28 11:16:56,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:01,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:05,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:09,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:17:11,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:11,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:11,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff3.min_abs, batch_count=933.3333333333334, ans=0.04666666666666667 2023-09-28 11:17:12,110 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.78 vs. 
limit=5.233333333333333 2023-09-28 11:17:12,236 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=232.59 vs. limit=7.85 2023-09-28 11:17:13,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:13,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:17:13,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:14,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 11:17:14,821 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 11:17:15,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=933.3333333333334, ans=0.09416666666666668 2023-09-28 11:17:17,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:17,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:17:19,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:19,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:17:19,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:17:19,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:17:20,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:17:20,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:20,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:22,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:22,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:17:24,261 INFO [train.py:1039] (0/4) Epoch 1, batch 150, loss[loss=0.9776, simple_loss=0.8234, pruned_loss=1.103, over 23353.00 frames. ], tot_loss[loss=1.851, simple_loss=1.657, pruned_loss=1.687, over 2510367.09 frames. ], batch size: 93, lr: 2.93e-02, grad_scale: 0.5 2023-09-28 11:17:24,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:17:25,015 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=4.4 2023-09-28 11:17:27,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:17:28,713 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=14.52 vs. limit=7.875 2023-09-28 11:17:29,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:29,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:17:29,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:36,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:36,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:36,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1000.0, ans=0.453125 2023-09-28 11:17:41,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:17:41,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=1066.6666666666667, ans=0.04666666666666667 2023-09-28 11:17:43,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:47,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. 
Duration: 0.65 2023-09-28 11:17:47,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 11:17:47,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 11:17:50,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:17:50,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:17:53,369 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=20.88 vs. limit=7.9 2023-09-28 11:17:53,666 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=105.39 vs. limit=7.9 2023-09-28 11:17:54,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:17:54,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:54,883 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.85 vs. limit=8.3 2023-09-28 11:17:55,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:55,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:17:56,850 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=32.13 vs. limit=7.9 2023-09-28 11:17:57,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:57,862 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 11:17:58,754 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.10 vs. limit=8.3 2023-09-28 11:18:01,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:07,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:09,613 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.98 vs. limit=5.283333333333333 2023-09-28 11:18:10,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:18:10,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 11:18:15,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:18:15,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:18:15,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:16,265 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=83.00 vs. limit=7.925 2023-09-28 11:18:17,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:18:18,080 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=12.27 vs. limit=7.925 2023-09-28 11:18:18,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:18:22,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:18:22,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:22,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 11:18:25,122 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=24.86 vs. limit=8.4 2023-09-28 11:18:26,820 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.60 vs. limit=3.18 2023-09-28 11:18:31,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:18:31,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:18:33,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:18:33,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:18:37,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:38,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:18:40,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:18:41,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=23.47 vs. limit=7.975 2023-09-28 11:18:44,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:18:46,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:18:47,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:18:47,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. 
Duration: 0.69 2023-09-28 11:18:48,791 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=37.12 vs. limit=8.45 2023-09-28 11:18:50,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:50,163 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 11:18:54,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:57,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=1333.3333333333333, ans=0.09166666666666667 2023-09-28 11:18:58,375 INFO [train.py:1039] (0/4) Epoch 1, batch 200, loss[loss=0.944, simple_loss=0.7956, pruned_loss=0.9892, over 24093.00 frames. ], tot_loss[loss=1.539, simple_loss=1.362, pruned_loss=1.448, over 2981156.80 frames. ], batch size: 80, lr: 3.15e-02, grad_scale: 1.0 2023-09-28 11:19:00,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:19:01,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:19:03,419 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 9.506e+01 1.160e+02 1.347e+02 1.565e+02 3.276e+02, threshold=2.693e+02, percent-clipped=0.0 2023-09-28 11:19:05,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 11:19:05,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:19:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:09,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 11:19:11,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:19:11,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=1333.3333333333333, ans=0.4375 2023-09-28 11:19:12,075 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=32.94 vs. limit=8.5 2023-09-28 11:19:12,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:13,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:18,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:19:18,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:19:18,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:36,559 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.22 vs. 
limit=5.733333333333333 2023-09-28 11:19:39,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=1466.6666666666667, ans=0.43125 2023-09-28 11:19:42,033 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.16 vs. limit=8.6 2023-09-28 11:19:46,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:19:46,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:19:48,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:19:48,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:19:49,618 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=11.59 vs. limit=4.586666666666667 2023-09-28 11:19:50,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:19:50,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:19:50,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=1466.6666666666667, ans=0.09083333333333334 2023-09-28 11:19:52,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:19:52,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:19:52,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:52,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:19:54,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 11:19:55,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:19:55,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:00,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:20:04,376 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=92.75 vs. limit=8.075 2023-09-28 11:20:06,169 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=167.32 vs. limit=5.766666666666667 2023-09-28 11:20:09,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:20:18,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:20:18,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:20:24,114 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=59.78 vs. limit=8.1 2023-09-28 11:20:27,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:29,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 11:20:29,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:29,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:20:29,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:20:29,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:20:30,797 INFO [train.py:1039] (0/4) Epoch 1, batch 250, loss[loss=0.9082, simple_loss=0.7622, pruned_loss=0.91, over 24672.00 frames. ], tot_loss[loss=1.339, simple_loss=1.173, pruned_loss=1.279, over 3382289.32 frames. ], batch size: 65, lr: 3.38e-02, grad_scale: 1.0 2023-09-28 11:20:31,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 11:20:32,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:20:32,797 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 11:20:33,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:36,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:20:38,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:40,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:42,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:20:42,130 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=1666.6666666666667, ans=0.1375 2023-09-28 11:20:42,377 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=1666.6666666666667, ans=0.421875 2023-09-28 11:20:43,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:44,470 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=143.27 vs. limit=8.125 2023-09-28 11:20:45,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:20:51,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 11:20:54,464 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=26.39 vs. limit=8.15 2023-09-28 11:20:57,646 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.89 vs. limit=5.433333333333334 2023-09-28 11:20:59,623 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=18.35 vs. limit=5.433333333333334 2023-09-28 11:21:02,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:06,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:21:07,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:21:08,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=1800.0, ans=0.0595 2023-09-28 11:21:08,759 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=209.71 vs. limit=8.175 2023-09-28 11:21:15,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:21:15,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:21:17,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 11:21:17,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:18,348 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=19.54 vs. limit=8.175 2023-09-28 11:21:19,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:21:19,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:21:19,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:21,591 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=34.13 vs. limit=8.85 2023-09-28 11:21:23,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:21:26,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 11:21:26,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:26,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=1866.6666666666667, ans=0.4125 2023-09-28 11:21:30,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 11:21:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:21:30,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:21:30,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:21:32,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:21:32,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:21:34,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:34,343 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=1866.6666666666667, ans=0.4125 2023-09-28 11:21:35,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:21:35,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:39,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=1866.6666666666667, ans=0.057999999999999996 2023-09-28 11:21:40,094 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=25.97 vs. 
limit=8.2 2023-09-28 11:21:43,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:21:46,518 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=58.60 vs. limit=8.225 2023-09-28 11:21:46,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:49,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=44.07 vs. limit=8.225 2023-09-28 11:21:50,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:21:56,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:57,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:22:02,449 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=2000.0, ans=0.40625 2023-09-28 11:22:03,710 INFO [train.py:1039] (0/4) Epoch 1, batch 300, loss[loss=0.7459, simple_loss=0.6283, pruned_loss=0.7004, over 23706.00 frames. ], tot_loss[loss=1.204, simple_loss=1.046, pruned_loss=1.157, over 3662428.50 frames. ], batch size: 232, lr: 3.60e-02, grad_scale: 2.0 2023-09-28 11:22:03,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 11:22:03,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:22:05,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:22:07,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 11:22:07,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:22:08,695 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.74 vs. limit=9.0 2023-09-28 11:22:09,477 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 8.573e+01 1.074e+02 1.349e+02 1.820e+02 4.135e+02, threshold=2.699e+02, percent-clipped=10.0 2023-09-28 11:22:09,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:22:09,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 11:22:10,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=2000.0, ans=0.40625 2023-09-28 11:22:13,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:22:13,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:22:14,871 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=123.70 vs. 
limit=8.25 2023-09-28 11:22:17,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:22:17,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 11:22:19,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:22:19,555 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=2000.0, ans=0.055 2023-09-28 11:22:20,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:22:20,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 11:22:20,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:26,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:22:26,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=2066.6666666666665, ans=9.05 2023-09-28 11:22:32,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:22:32,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 11:22:36,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. 
Duration: 0.81 2023-09-28 11:22:37,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:39,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:40,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=2133.3333333333335, ans=0.08666666666666667 2023-09-28 11:22:42,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:42,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 11:22:42,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:22:42,732 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.15 vs. limit=9.1 2023-09-28 11:22:43,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:22:47,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:22:48,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:53,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 11:22:53,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 11:22:56,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:22:56,941 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.81 vs. limit=9.1 2023-09-28 11:22:57,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:59,059 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.87 vs. limit=9.15 2023-09-28 11:22:59,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 11:23:01,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:05,626 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=2200.0, ans=0.396875 2023-09-28 11:23:07,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=2200.0, ans=0.22499999999999998 2023-09-28 11:23:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:23:11,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:23:11,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 11:23:14,136 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=155.26 vs. limit=8.325 2023-09-28 11:23:16,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:16,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:23:18,857 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.79 vs. limit=8.35 2023-09-28 11:23:19,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:20,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:23:21,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 11:23:21,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:23:22,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:23:25,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-09-28 11:23:27,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:27,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:30,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:30,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:31,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:36,586 INFO [train.py:1039] (0/4) Epoch 1, batch 350, loss[loss=0.8495, simple_loss=0.6974, pruned_loss=0.8272, over 24516.00 frames. ], tot_loss[loss=1.108, simple_loss=0.9541, pruned_loss=1.063, over 3893883.16 frames. ], batch size: 63, lr: 3.83e-02, grad_scale: 2.0 2023-09-28 11:23:38,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:38,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:23:40,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:46,859 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=7.32 vs. limit=4.933333333333334 2023-09-28 11:23:48,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 11:23:50,485 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=2333.3333333333335, ans=0.0475 2023-09-28 11:23:51,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:52,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:57,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 11:23:57,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:57,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 11:24:00,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:01,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 11:24:02,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:04,159 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=15.11 vs. limit=5.6 2023-09-28 11:24:04,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-09-28 11:24:05,636 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.14 vs. limit=5.6 2023-09-28 11:24:08,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:24:08,611 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=2400.0, ans=0.27599999999999997 2023-09-28 11:24:10,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:10,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:24:12,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:12,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:14,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:24:14,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 11:24:14,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:17,111 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.91 vs. limit=8.425 2023-09-28 11:24:24,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:24:24,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:24:25,153 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=14.92 vs. limit=8.425 2023-09-28 11:24:26,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:24:26,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:29,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=2466.6666666666665, ans=0.2753333333333333 2023-09-28 11:24:32,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 11:24:32,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:35,560 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.63 vs. 
limit=8.45 2023-09-28 11:24:40,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:40,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:24:42,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:24:42,859 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.27 vs. limit=8.45 2023-09-28 11:24:43,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 11:24:44,922 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.18 vs. limit=6.266666666666667 2023-09-28 11:24:46,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:46,417 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 11:24:47,024 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=15.18 vs. limit=8.45 2023-09-28 11:24:48,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 11:24:48,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:24:52,229 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.86 vs. limit=9.45 2023-09-28 11:24:53,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:53,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 11:24:54,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:57,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:24:59,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:00,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:00,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:01,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=2600.0, ans=0.041499999999999995 2023-09-28 11:25:01,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=2600.0, ans=0.809 2023-09-28 11:25:02,002 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=216.80 vs. 
limit=8.475 2023-09-28 11:25:03,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:08,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:25:10,540 INFO [train.py:1039] (0/4) Epoch 1, batch 400, loss[loss=0.9315, simple_loss=0.7614, pruned_loss=0.8789, over 24645.00 frames. ], tot_loss[loss=1.049, simple_loss=0.8946, pruned_loss=1.002, over 4084665.69 frames. ], batch size: 73, lr: 4.05e-02, grad_scale: 4.0 2023-09-28 11:25:10,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:25:12,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 11:25:12,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:12,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:14,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:25:14,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:15,802 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 9.874e+01 1.367e+02 1.651e+02 2.389e+02 7.473e+02, threshold=3.302e+02, percent-clipped=14.0 2023-09-28 11:25:18,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:25:18,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=2666.6666666666665, ans=0.375 2023-09-28 11:25:19,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:21,054 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=21.20 vs. limit=8.5 2023-09-28 11:25:22,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 11:25:23,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 11:25:23,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:25,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 11:25:25,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:26,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=26.46 vs. limit=8.5 2023-09-28 11:25:28,298 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.34 vs. limit=8.525 2023-09-28 11:25:29,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 11:25:29,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:30,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 11:25:31,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:25:31,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:31,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:33,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:35,722 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 11:25:37,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 11:25:40,166 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.58 vs. limit=9.55 2023-09-28 11:25:42,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:42,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=2733.3333333333335, ans=0.0385 2023-09-28 11:25:44,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:25:45,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.20 vs. limit=9.55 2023-09-28 11:25:45,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 11:25:46,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 11:25:49,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:25:51,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:25:56,400 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=13.95 vs. limit=8.55 2023-09-28 11:25:57,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 11:26:02,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:26:04,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 11:26:05,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=2866.6666666666665, ans=0.2713333333333333 2023-09-28 11:26:08,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:26:09,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 11:26:09,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 11:26:10,738 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.65 vs. limit=5.1466666666666665 2023-09-28 11:26:15,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:26:17,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:26:19,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:26:19,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=2866.6666666666665, ans=0.365625 2023-09-28 11:26:21,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=2866.6666666666665, ans=0.0925 2023-09-28 11:26:22,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:22,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 11:26:24,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:26:27,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. 
Duration: 0.67 2023-09-28 11:26:28,741 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.90 vs. limit=8.6 2023-09-28 11:26:30,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:26:30,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:26:33,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 11:26:35,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:26:36,374 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.92 vs. limit=6.466666666666667 2023-09-28 11:26:37,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:26:37,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:26:40,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 11:26:40,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:26:40,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:26:42,300 INFO [train.py:1039] (0/4) Epoch 1, batch 450, loss[loss=0.8314, simple_loss=0.6851, pruned_loss=0.738, over 23734.00 frames. ], tot_loss[loss=1.007, simple_loss=0.8522, pruned_loss=0.9544, over 4216493.79 frames. ], batch size: 232, lr: 4.28e-02, grad_scale: 4.0 2023-09-28 11:26:42,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:26:42,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 11:26:42,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:26:44,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:26:48,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:26:57,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:59,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:01,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 11:27:03,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 11:27:03,721 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=40.35 vs. 
limit=6.533333333333333 2023-09-28 11:27:03,910 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.37 vs. limit=8.65 2023-09-28 11:27:05,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=3066.6666666666665, ans=0.35625 2023-09-28 11:27:05,682 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.24 vs. limit=9.8 2023-09-28 11:27:08,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:27:11,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:13,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:16,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.10 vs. limit=8.65 2023-09-28 11:27:19,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:19,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:19,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=3133.3333333333335, ans=0.353125 2023-09-28 11:27:21,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. 
Duration: 0.84 2023-09-28 11:27:22,424 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=20.26 vs. limit=8.675 2023-09-28 11:27:22,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 11:27:24,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 11:27:26,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:27:28,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:28,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:27:29,324 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=24.10 vs. limit=6.566666666666666 2023-09-28 11:27:31,086 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 11:27:31,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten.whitening_limit, batch_count=3133.3333333333335, ans=8.675 2023-09-28 11:27:32,570 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 11:27:32,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:34,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 11:27:36,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:27:39,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:27:39,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:27:41,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:27:41,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 11:27:44,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:46,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:27:48,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:27:49,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 11:27:50,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=3200.0, ans=0.35 2023-09-28 11:27:53,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:27:56,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-09-28 11:27:56,193 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=4.108e+01 2023-09-28 11:27:56,972 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=9.95 2023-09-28 11:27:57,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 11:27:59,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:28:01,757 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.96 vs. limit=8.725 2023-09-28 11:28:04,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:28:05,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:09,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:28:09,119 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 11:28:12,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:12,982 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=12.35 vs. 
limit=6.666666666666667 2023-09-28 11:28:13,442 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.45 vs. limit=8.75 2023-09-28 11:28:13,986 INFO [train.py:1039] (0/4) Epoch 1, batch 500, loss[loss=0.8066, simple_loss=0.6631, pruned_loss=0.696, over 22757.00 frames. ], tot_loss[loss=0.974, simple_loss=0.8171, pruned_loss=0.9127, over 4333574.37 frames. ], batch size: 322, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:28:14,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:28:14,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:14,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=3333.3333333333335, ans=0.26666666666666666 2023-09-28 11:28:15,749 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 11:28:15,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 11:28:15,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:19,323 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 9.903e+01 1.529e+02 1.913e+02 2.430e+02 4.167e+02, threshold=3.825e+02, percent-clipped=6.0 2023-09-28 11:28:19,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:28:26,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 11:28:28,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:28:28,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=3333.3333333333335, ans=0.04949747468305833 2023-09-28 11:28:31,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:31,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:33,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:28:37,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=3400.0, ans=0.340625 2023-09-28 11:28:43,017 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=1.83 vs. limit=3.51 2023-09-28 11:28:47,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:47,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:28:47,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:28:47,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:28:48,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 11:28:48,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:28:52,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:28:53,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:28:53,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:28:53,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:53,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 11:28:54,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=3466.6666666666665, ans=0.07 2023-09-28 11:28:55,838 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 11:28:59,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:28:59,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:01,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:29:02,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:29:04,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 11:29:08,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:29:10,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:14,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:18,237 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=3533.3333333333335, ans=0.334375 2023-09-28 11:29:19,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:26,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:30,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 11:29:30,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:29:30,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=3600.0, ans=0.774 2023-09-28 11:29:31,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:34,218 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=19.25 vs. limit=8.85 2023-09-28 11:29:35,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 11:29:35,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:29:36,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:43,555 INFO [train.py:1039] (0/4) Epoch 1, batch 550, loss[loss=0.7773, simple_loss=0.6317, pruned_loss=0.6685, over 23471.00 frames. ], tot_loss[loss=0.9485, simple_loss=0.7898, pruned_loss=0.8768, over 4422891.85 frames. ], batch size: 134, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:29:43,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 11:29:45,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 11:29:45,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:45,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-09-28 11:29:47,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:29:47,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:49,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:29:51,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:29:53,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:56,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 11:29:56,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:30:00,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:00,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:04,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:30:05,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:11,514 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.51 vs. limit=5.493333333333333 2023-09-28 11:30:12,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 11:30:12,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 11:30:13,085 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=67.24 vs. limit=10.3 2023-09-28 11:30:14,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:30:19,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:30:21,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:21,951 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.30 vs. limit=8.925 2023-09-28 11:30:22,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:30:27,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:27,086 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. 
Duration: 0.958 2023-09-28 11:30:27,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:28,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:30:32,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:35,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:30:35,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:30:37,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:38,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 11:30:40,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 11:30:40,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:40,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:42,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:30:42,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:30:42,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=3866.6666666666665, ans=0.7646666666666667 2023-09-28 11:30:43,018 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.35 vs. limit=8.95 2023-09-28 11:30:45,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:30:45,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:30:46,310 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.61 vs. limit=6.933333333333334 2023-09-28 11:30:48,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:30:50,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:51,155 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.37 vs. limit=10.4 2023-09-28 11:30:51,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:30:51,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:30:53,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:30:55,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:30:55,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:57,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:30:58,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:31:07,184 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.81 vs. limit=8.975 2023-09-28 11:31:08,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 11:31:08,998 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.52 vs. limit=8.975 2023-09-28 11:31:13,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 11:31:14,156 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.10 vs. limit=9.0 2023-09-28 11:31:14,893 INFO [train.py:1039] (0/4) Epoch 1, batch 600, loss[loss=0.7776, simple_loss=0.6282, pruned_loss=0.6579, over 24314.00 frames. ], tot_loss[loss=0.9259, simple_loss=0.7664, pruned_loss=0.842, over 4498383.78 frames. ], batch size: 56, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:31:14,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:31:15,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:31:15,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:31:15,996 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.24 vs. limit=9.0 2023-09-28 11:31:17,588 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.71 vs. limit=10.5 2023-09-28 11:31:21,668 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.125e+02 1.678e+02 2.306e+02 3.262e+02 8.742e+02, threshold=4.612e+02, percent-clipped=14.0 2023-09-28 11:31:23,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:31:24,277 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.91 vs. limit=9.0 2023-09-28 11:31:25,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:31:26,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 11:31:28,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:31:30,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 11:31:31,424 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.79 vs. limit=9.025 2023-09-28 11:31:32,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:34,618 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.62 vs. limit=10.55 2023-09-28 11:31:35,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 11:31:37,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:31:44,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 11:31:48,375 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.40 vs. limit=9.025 2023-09-28 11:31:49,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:31:49,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:49,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:31:53,674 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.62 vs. 
limit=9.05 2023-09-28 11:31:55,292 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=35.47 vs. limit=9.05 2023-09-28 11:31:56,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:31:56,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:31:57,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:31:58,253 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=4133.333333333333, ans=0.7553333333333334 2023-09-28 11:32:00,317 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.92 vs. limit=6.033333333333333 2023-09-28 11:32:06,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:32:10,912 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=31.08 vs. limit=9.075 2023-09-28 11:32:11,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:11,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:32:11,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:32:17,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 11:32:24,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:32:24,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:32:26,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=4266.666666666667, ans=0.7506666666666667 2023-09-28 11:32:28,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=4266.666666666667, ans=6.066666666666666 2023-09-28 11:32:29,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 11:32:29,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:32:33,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 11:32:33,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:32:33,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:32:35,856 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=10.50 vs. 
limit=10.7 2023-09-28 11:32:42,887 INFO [train.py:1039] (0/4) Epoch 1, batch 650, loss[loss=0.8871, simple_loss=0.729, pruned_loss=0.7043, over 24645.00 frames. ], tot_loss[loss=0.9011, simple_loss=0.7435, pruned_loss=0.8025, over 4532113.68 frames. ], batch size: 68, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:32:42,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:32:45,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:32:48,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:32:48,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:32:51,161 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=21.51 vs. limit=9.125 2023-09-28 11:32:52,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:32:55,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 11:32:56,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:59,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=4400.0, ans=0.04833333333333334 2023-09-28 11:33:03,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 11:33:03,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:05,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:09,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 11:33:10,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:10,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:15,101 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=45.30 vs. limit=9.15 2023-09-28 11:33:15,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:15,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:33:19,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:19,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:19,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-28 11:33:20,821 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.46 vs. limit=6.116666666666667 2023-09-28 11:33:21,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:23,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:33:25,833 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=14.46 vs. limit=9.175 2023-09-28 11:33:26,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:33:26,435 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 11:33:26,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:26,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:27,293 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=24.21 vs. limit=9.175 2023-09-28 11:33:31,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:33,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:33:33,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:33,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:33:33,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=4466.666666666667, ans=0.290625 2023-09-28 11:33:34,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=4466.666666666667, ans=6.116666666666667 2023-09-28 11:33:35,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 11:33:37,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:33:37,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:33:37,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=4533.333333333333, ans=0.2875 2023-09-28 11:33:39,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:33:39,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:39,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.50 vs. 
limit=9.2 2023-09-28 11:33:40,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:33:42,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 11:33:42,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 11:33:42,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:42,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:42,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:33:43,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=4533.333333333333, ans=0.04777777777777778 2023-09-28 11:33:44,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:46,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:48,716 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.95 vs. limit=6.133333333333333 2023-09-28 11:33:53,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:33:53,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:54,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:58,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:58,871 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.55 vs. limit=6.15 2023-09-28 11:33:59,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:33:59,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:34:05,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:34:05,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:06,461 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.04 vs. limit=10.95 2023-09-28 11:34:07,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:07,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:34:13,628 INFO [train.py:1039] (0/4) Epoch 1, batch 700, loss[loss=0.7661, simple_loss=0.6297, pruned_loss=0.5938, over 23938.00 frames. ], tot_loss[loss=0.8732, simple_loss=0.7199, pruned_loss=0.7588, over 4566910.68 frames. ], batch size: 195, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:34:15,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 11:34:16,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 11:34:17,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=7.28 vs. limit=5.866666666666667 2023-09-28 11:34:20,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 11:34:20,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:21,818 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.160e+02 1.725e+02 2.743e+02 3.715e+02 1.987e+03, threshold=5.486e+02, percent-clipped=15.0 2023-09-28 11:34:22,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:34:22,771 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.47 vs. limit=9.25 2023-09-28 11:34:25,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 11:34:29,272 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.25 vs. 
limit=9.275 2023-09-28 11:34:30,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:33,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:34:35,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:35,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:34:37,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:34:40,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:44,105 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.59 vs. limit=11.05 2023-09-28 11:34:44,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:34:44,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:34:46,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 11:34:51,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 11:34:55,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 11:34:57,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:34:58,187 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.26 vs. limit=6.2 2023-09-28 11:34:58,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:35:02,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:35:04,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 11:35:06,670 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.61 vs. limit=11.15 2023-09-28 11:35:06,715 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.15 vs. limit=6.216666666666667 2023-09-28 11:35:09,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:11,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:35:11,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 11:35:14,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 11:35:16,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:20,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:35:21,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=4933.333333333333, ans=0.26875 2023-09-28 11:35:27,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:35:28,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 11:35:31,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 11:35:31,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 11:35:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:34,072 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.03 vs. limit=11.2 2023-09-28 11:35:34,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:35,628 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.54 vs. 
limit=6.233333333333333 2023-09-28 11:35:36,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:35:39,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:39,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 11:35:41,543 INFO [train.py:1039] (0/4) Epoch 1, batch 750, loss[loss=0.717, simple_loss=0.593, pruned_loss=0.5374, over 23685.00 frames. ], tot_loss[loss=0.8428, simple_loss=0.6959, pruned_loss=0.7124, over 4596208.03 frames. ], batch size: 212, lr: 4.49e-02, grad_scale: 4.0 2023-09-28 11:35:44,233 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.06 vs. limit=9.375 2023-09-28 11:35:44,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 11:35:44,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 11:35:44,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 11:35:46,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 11:35:46,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 11:35:46,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:35:48,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 11:35:49,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:49,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:35:51,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:35:53,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:53,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:35:54,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:56,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:35:59,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:36:04,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:36:06,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:36:06,732 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.50 vs. limit=9.4 2023-09-28 11:36:07,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:07,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 11:36:09,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:36:11,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:12,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:14,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:36:16,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 11:36:16,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:36:19,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 11:36:19,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 11:36:19,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-09-28 11:36:19,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:36:19,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:36:19,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=5133.333333333333, ans=0.7203333333333334 2023-09-28 11:36:22,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:36:24,931 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=5133.333333333333, ans=0.009753623188405797 2023-09-28 11:36:30,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:36:31,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:31,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:36:31,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=5200.0, ans=0.00973913043478261 2023-09-28 11:36:31,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=5200.0, ans=0.7180000000000001 2023-09-28 11:36:32,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:36:35,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:36:35,910 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.69 vs. limit=6.3 2023-09-28 11:36:36,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 11:36:38,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:36:39,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:36:39,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.52 vs. limit=9.45 2023-09-28 11:36:40,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:36:41,621 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.21 vs. limit=9.45 2023-09-28 11:36:42,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:36:42,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 11:36:44,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:36:50,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:36:50,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=5266.666666666667, ans=0.7156666666666667 2023-09-28 11:36:52,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:36:52,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:55,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:37:00,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 11:37:00,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:02,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:07,768 INFO [train.py:1039] (0/4) Epoch 1, batch 800, loss[loss=0.7214, simple_loss=0.6141, pruned_loss=0.5001, over 24648.00 frames. ], tot_loss[loss=0.8112, simple_loss=0.6725, pruned_loss=0.6648, over 4621463.55 frames. 
], batch size: 68, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:37:08,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:10,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:37:16,636 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 4.125e+02 6.476e+02 9.801e+02 2.445e+03, threshold=1.295e+03, percent-clipped=55.0 2023-09-28 11:37:19,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:19,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:21,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:21,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:23,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:23,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:26,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:30,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:37:30,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:37:30,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=5400.0, ans=0.246875 2023-09-28 11:37:30,853 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.42 vs. limit=3.81 2023-09-28 11:37:33,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 11:37:35,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:35,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:35,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:37:35,993 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.92 vs. limit=9.525 2023-09-28 11:37:36,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:36,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 11:37:36,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:37:38,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 11:37:40,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:40,878 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=7.44 vs. limit=6.16 2023-09-28 11:37:44,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:47,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:47,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:50,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:52,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:52,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=5466.666666666667, ans=0.24375000000000002 2023-09-28 11:37:56,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:37:57,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 11:37:57,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:38:01,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 11:38:01,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 11:38:01,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:38:01,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:01,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=5533.333333333333, ans=0.24062499999999998 2023-09-28 11:38:03,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:03,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:09,848 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 11:38:09,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 11:38:11,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:38:13,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 11:38:18,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:38:21,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:38:23,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 11:38:23,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:38:27,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 11:38:34,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:36,583 INFO [train.py:1039] (0/4) Epoch 1, batch 850, loss[loss=0.6629, simple_loss=0.5584, pruned_loss=0.4623, over 22701.00 frames. ], tot_loss[loss=0.7743, simple_loss=0.6455, pruned_loss=0.6148, over 4650520.79 frames. ], batch size: 322, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:38:38,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:38:40,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 11:38:40,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:38:40,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:38:40,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 11:38:40,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:40,709 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 11:38:43,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:38:45,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:45,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:38:46,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:47,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=5666.666666666667, ans=0.234375 2023-09-28 11:38:48,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 11:38:50,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 11:38:50,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 11:38:51,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 11:38:51,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:38:53,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:53,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:54,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:38:57,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=5733.333333333333, ans=0.23125 2023-09-28 11:38:59,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=5733.333333333333, ans=0.23125 2023-09-28 11:39:00,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=5733.333333333333, ans=0.035 2023-09-28 11:39:01,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:01,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:03,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 11:39:06,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 11:39:10,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:39:10,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 11:39:16,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 11:39:16,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 11:39:19,965 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 11:39:19,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:19,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:39:20,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:39:23,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:24,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:24,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 11:39:26,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:28,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:39:29,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:39:30,515 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.95 vs. limit=11.9 2023-09-28 11:39:30,614 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.34 vs. limit=6.346666666666667 2023-09-28 11:39:31,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:39:33,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:39:35,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:39:35,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 11:39:37,246 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=5866.666666666667, ans=0.22499999999999998 2023-09-28 11:39:38,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=5866.666666666667, ans=0.6946666666666667 2023-09-28 11:39:40,945 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.78 vs. limit=11.9 2023-09-28 11:39:42,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 11:39:42,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:39:42,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:39:42,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:39:44,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:45,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:47,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:39:49,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:39:50,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:39:51,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:39:57,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=5933.333333333333, ans=0.221875 2023-09-28 11:40:00,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. 
Duration: 0.62225 2023-09-28 11:40:01,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:40:03,137 INFO [train.py:1039] (0/4) Epoch 1, batch 900, loss[loss=0.6339, simple_loss=0.5504, pruned_loss=0.4105, over 24014.00 frames. ], tot_loss[loss=0.7426, simple_loss=0.6226, pruned_loss=0.5713, over 4666823.34 frames. ], batch size: 86, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:40:03,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 11:40:03,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:03,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:40:06,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 11:40:10,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:40:12,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:13,966 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 3.628e+02 6.882e+02 1.109e+03 2.718e+03, threshold=1.376e+03, percent-clipped=19.0 2023-09-28 11:40:14,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 11:40:14,677 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.93 vs. 
limit=9.75 2023-09-28 11:40:17,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:40:17,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 11:40:18,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:40:19,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:19,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:21,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:40:21,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:40:36,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:40:36,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:36,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:40:38,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:43,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. 
Duration: 0.47 2023-09-28 11:40:46,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:40:52,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:40:54,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:40:54,141 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 11:40:55,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 11:41:01,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:41:01,472 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=6200.0, ans=0.20937499999999998 2023-09-28 11:41:02,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:41:02,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:41:09,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:09,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:41:11,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-09-28 11:41:13,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:41:13,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 11:41:16,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:41:16,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:17,291 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.01 vs. limit=8.133333333333333 2023-09-28 11:41:17,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:41:17,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:21,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 11:41:23,517 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 11:41:26,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:41:26,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 11:41:29,589 INFO [train.py:1039] (0/4) Epoch 1, batch 950, loss[loss=0.5405, simple_loss=0.482, pruned_loss=0.328, over 24301.00 frames. 
], tot_loss[loss=0.7111, simple_loss=0.6, pruned_loss=0.5305, over 4681298.06 frames. ], batch size: 61, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:41:29,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:33,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 11:41:34,568 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.05 vs. limit=12.25 2023-09-28 11:41:37,777 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.27 vs. limit=6.583333333333333 2023-09-28 11:41:39,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:42,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:42,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:42,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=6333.333333333333, ans=0.23666666666666666 2023-09-28 11:41:43,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:41:46,841 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 11:41:51,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:41:53,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:41:53,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:53,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:41:53,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 11:41:54,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:41:56,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:56,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 11:41:59,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:04,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:42:05,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-09-28 11:42:07,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:42:11,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:42:11,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=6466.666666666667, ans=0.19687500000000002 2023-09-28 11:42:12,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:42:18,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:42:18,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:42:21,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 11:42:23,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:42:23,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:42:23,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=6533.333333333333, ans=0.6713333333333333 2023-09-28 11:42:25,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:42:25,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:25,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:42:30,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 11:42:32,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:42:33,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:35,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:35,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 11:42:35,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:35,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:42:36,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 11:42:42,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 11:42:42,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=6600.0, ans=0.009434782608695652 2023-09-28 11:42:46,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:51,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:42:52,491 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.55 vs. limit=9.975 2023-09-28 11:42:53,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 11:42:53,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 11:42:56,624 INFO [train.py:1039] (0/4) Epoch 1, batch 1000, loss[loss=0.5414, simple_loss=0.4834, pruned_loss=0.3251, over 24563.00 frames. ], tot_loss[loss=0.6805, simple_loss=0.5784, pruned_loss=0.4922, over 4686667.45 frames. ], batch size: 60, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:42:58,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:43:01,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 11:43:03,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:43:06,716 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.970e+02 4.014e+02 6.511e+02 1.253e+03 2.271e+03, threshold=1.302e+03, percent-clipped=16.0 2023-09-28 11:43:08,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:43:10,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 11:43:10,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 11:43:15,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:15,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:43:15,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:19,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 11:43:25,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 11:43:27,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 11:43:27,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:43:27,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=6733.333333333333, ans=0.009405797101449275 2023-09-28 11:43:28,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 11:43:29,764 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=2.565e-03 2023-09-28 11:43:32,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:43:32,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 11:43:32,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:34,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:41,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=6800.0, ans=0.009391304347826087 2023-09-28 11:43:43,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:44,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:43:45,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:45,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:43:45,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 11:43:47,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:47,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:43:49,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:49,376 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 11:43:52,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 11:43:54,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 11:43:56,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 11:43:57,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:44:00,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=6866.666666666667, ans=0.04949747468305833 2023-09-28 11:44:03,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:04,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 11:44:05,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:44:09,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 11:44:09,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:44:09,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=6933.333333333333, ans=0.175 2023-09-28 11:44:10,016 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=6933.333333333333, ans=0.175 2023-09-28 11:44:11,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 11:44:11,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 11:44:12,269 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.16 vs. limit=12.7 2023-09-28 11:44:12,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:12,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:44:15,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 11:44:16,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:44:20,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:44:21,956 INFO [train.py:1039] (0/4) Epoch 1, batch 1050, loss[loss=0.4911, simple_loss=0.4138, pruned_loss=0.3247, over 19235.00 frames. ], tot_loss[loss=0.6518, simple_loss=0.5582, pruned_loss=0.4576, over 4691392.30 frames. ], batch size: 389, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:44:25,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:44:26,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:44:28,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:44:30,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:32,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=7000.0, ans=0.22999999999999998 2023-09-28 11:44:33,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:44:35,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:44:36,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 11:44:37,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=7066.666666666667, ans=0.037222222222222226 2023-09-28 11:44:40,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:44:40,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:44:40,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:44:42,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:44:42,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 11:44:42,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=7066.666666666667, ans=0.22933333333333333 2023-09-28 11:44:43,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:44:43,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 11:44:47,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:47,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-28 11:44:47,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:44:51,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=7066.666666666667, ans=0.6526666666666667 2023-09-28 11:44:56,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:58,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:44:58,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:45:01,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 11:45:01,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 11:45:01,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:45:02,045 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.95 vs. limit=6.783333333333333 2023-09-28 11:45:05,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 11:45:08,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. 
Duration: 0.73 2023-09-28 11:45:09,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:12,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:45:14,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:45:14,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:45:16,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:45:19,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:45:22,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 11:45:25,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 11:45:25,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 11:45:26,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:26,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:45:28,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. 
Duration: 0.88 2023-09-28 11:45:32,503 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=7266.666666666667, ans=0.22733333333333333 2023-09-28 11:45:33,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:45:35,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:35,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:45:36,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:36,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 11:45:43,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:43,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 11:45:43,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 11:45:45,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 11:45:46,848 INFO [train.py:1039] (0/4) Epoch 1, batch 1100, loss[loss=0.5236, simple_loss=0.4639, pruned_loss=0.3148, over 23692.00 frames. ], tot_loss[loss=0.6262, simple_loss=0.5402, pruned_loss=0.4274, over 4699250.60 frames. ], batch size: 149, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:45:48,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:45:49,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.95 vs. limit=8.666666666666666 2023-09-28 11:45:55,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:45:58,774 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 4.555e+02 7.978e+02 1.389e+03 3.645e+03, threshold=1.596e+03, percent-clipped=29.0 2023-09-28 11:46:00,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:46:00,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:46:00,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:02,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 11:46:04,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:46:07,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:46:08,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:46:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:46:10,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 11:46:12,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:46:13,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:13,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:46:17,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:46:20,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:46:23,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:46:27,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 11:46:28,887 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. 
Duration: 0.896 2023-09-28 11:46:28,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:29,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=7466.666666666667, ans=0.15000000000000002 2023-09-28 11:46:32,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:32,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:46:34,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:46:34,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 11:46:34,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:46:34,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:46:35,183 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=7466.666666666667, ans=0.312 2023-09-28 11:46:36,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:46:36,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:37,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. 
Duration: 0.77 2023-09-28 11:46:39,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=7533.333333333333, ans=0.03527777777777778 2023-09-28 11:46:42,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:46:43,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 11:46:45,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:46:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:46:54,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 11:46:54,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:46:56,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:56,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=7600.0, ans=0.035 2023-09-28 11:46:59,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:59,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:01,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. 
Duration: 0.86 2023-09-28 11:47:03,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:47:04,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:06,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 11:47:06,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:47:08,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 11:47:09,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:47:09,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:47:09,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:47:12,099 INFO [train.py:1039] (0/4) Epoch 1, batch 1150, loss[loss=0.5021, simple_loss=0.4716, pruned_loss=0.269, over 24424.00 frames. ], tot_loss[loss=0.6052, simple_loss=0.5258, pruned_loss=0.402, over 4704570.06 frames. ], batch size: 69, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:47:15,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:18,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:47:20,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:47:21,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:47:21,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 11:47:21,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:25,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 11:47:26,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:26,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:47:28,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=7733.333333333333, ans=0.22266666666666668 2023-09-28 11:47:32,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 11:47:35,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:47:38,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=7733.333333333333, ans=0.0 2023-09-28 11:47:39,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:41,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:47:41,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 11:47:43,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:47:43,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:44,177 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.87 vs. limit=6.95 2023-09-28 11:47:46,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 11:47:48,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:51,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 11:47:55,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=7800.0, ans=0.13437500000000002 2023-09-28 11:47:58,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=7800.0, ans=0.13437500000000002 2023-09-28 11:47:59,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:07,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:07,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 11:48:09,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:09,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:16,033 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 11:48:18,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:24,390 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.98 vs. limit=13.45 2023-09-28 11:48:26,764 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 11:48:30,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:48:30,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=7933.333333333333, ans=0.22066666666666668 2023-09-28 11:48:33,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:48:33,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:48:33,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:48:34,811 INFO [train.py:1039] (0/4) Epoch 1, batch 1200, loss[loss=0.6899, simple_loss=0.5786, pruned_loss=0.445, over 19262.00 frames. ], tot_loss[loss=0.5872, simple_loss=0.5141, pruned_loss=0.3799, over 4709065.20 frames. ], batch size: 388, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:48:37,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:41,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:48:41,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:48:43,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:48:43,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:48:44,066 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.97 vs. limit=7.0 2023-09-28 11:48:44,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:48:46,294 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 4.760e+02 7.806e+02 1.164e+03 2.947e+03, threshold=1.561e+03, percent-clipped=14.0 2023-09-28 11:48:46,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:48:48,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:48:50,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:50,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:51,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 11:48:54,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 11:49:00,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:49:01,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:49:03,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:49:05,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=8066.666666666667, ans=0.009115942028985507 2023-09-28 11:49:06,163 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=9.49 vs. limit=9.033333333333333 2023-09-28 11:49:06,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:06,748 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 11:49:08,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:13,613 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=8133.333333333333, ans=0.125 2023-09-28 11:49:18,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:49:18,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:49:18,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 11:49:19,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:49:23,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 11:49:25,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. 
Duration: 0.88 2023-09-28 11:49:26,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:28,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:49:28,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:30,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:49:32,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:32,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:49:34,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:49:34,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 11:49:35,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:49:35,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:36,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:49:38,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:49:38,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:43,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:49:45,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:49:48,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 11:49:53,397 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 11:49:55,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:58,119 INFO [train.py:1039] (0/4) Epoch 1, batch 1250, loss[loss=0.5674, simple_loss=0.4958, pruned_loss=0.3419, over 22699.00 frames. ], tot_loss[loss=0.5724, simple_loss=0.5045, pruned_loss=0.3615, over 4709992.39 frames. ], batch size: 322, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:49:58,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:59,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:50:01,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:50:04,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-09-28 11:50:08,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:50:09,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:09,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 11:50:10,703 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=10.625 2023-09-28 11:50:11,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:50:12,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:50:15,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:50:18,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:19,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:50:19,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:21,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 11:50:24,028 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.50 vs. limit=13.8 2023-09-28 11:50:26,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:50:26,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:50:26,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:50:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:29,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:30,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:32,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:50:35,509 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.09 vs. limit=4.27 2023-09-28 11:50:38,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 11:50:38,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 11:50:41,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:50:41,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 11:50:41,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:42,843 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 11:50:42,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:42,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:47,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:51,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=8533.333333333334, ans=0.21466666666666667 2023-09-28 11:50:52,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:52,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:50:54,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 11:50:54,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. 
Duration: 0.77 2023-09-28 11:50:55,090 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.17 vs. limit=10.7 2023-09-28 11:50:55,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 11:50:58,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:00,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 11:51:00,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:00,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=8533.333333333334, ans=0.125 2023-09-28 11:51:04,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:51:04,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:51:07,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 11:51:07,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:51:07,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:51:10,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-09-28 11:51:10,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:12,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 11:51:13,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:17,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:51:18,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:51:20,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:51:21,989 INFO [train.py:1039] (0/4) Epoch 1, batch 1300, loss[loss=0.4745, simple_loss=0.4303, pruned_loss=0.2687, over 24461.00 frames. ], tot_loss[loss=0.5571, simple_loss=0.4946, pruned_loss=0.3439, over 4714981.76 frames. ], batch size: 58, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:51:23,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:23,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 11:51:30,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:31,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. 
Duration: 0.67275 2023-09-28 11:51:32,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:51:34,984 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 3.707e+02 6.388e+02 1.142e+03 3.121e+03, threshold=1.278e+03, percent-clipped=13.0 2023-09-28 11:51:35,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:38,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:51:38,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 11:51:43,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:51:45,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:51:47,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 11:51:50,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:51:52,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=8733.333333333334, ans=0.008971014492753624 2023-09-28 11:51:55,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 11:51:55,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:57,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:57,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=8800.0, ans=0.030000000000000002 2023-09-28 11:51:58,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:00,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:52:00,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:52:00,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 11:52:02,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=8800.0, ans=0.008956521739130436 2023-09-28 11:52:08,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:52:08,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:52:10,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 11:52:10,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-28 11:52:11,052 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.69 vs. limit=7.216666666666667 2023-09-28 11:52:11,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:52:14,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:52:15,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 11:52:16,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:16,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 11:52:20,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:22,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:52:22,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:52:25,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 11:52:27,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 11:52:29,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. 
Duration: 0.66 2023-09-28 11:52:32,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:52:34,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=8933.333333333334, ans=0.125 2023-09-28 11:52:36,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 11:52:39,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:44,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=9000.0, ans=0.125 2023-09-28 11:52:45,143 INFO [train.py:1039] (0/4) Epoch 1, batch 1350, loss[loss=0.4688, simple_loss=0.4408, pruned_loss=0.2498, over 24622.00 frames. ], tot_loss[loss=0.5441, simple_loss=0.4852, pruned_loss=0.3301, over 4703802.68 frames. ], batch size: 60, lr: 4.46e-02, grad_scale: 4.0 2023-09-28 11:52:46,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 11:52:49,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:51,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:52:55,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:55,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 11:52:56,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:52:57,107 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=9000.0, ans=0.21000000000000002 2023-09-28 11:52:58,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:03,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:05,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 11:53:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:06,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:53:07,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=9066.666666666666, ans=0.125 2023-09-28 11:53:10,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 11:53:11,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:53:12,041 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 11:53:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:53:13,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 11:53:15,785 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.56 vs. limit=7.266666666666667 2023-09-28 11:53:16,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 11:53:18,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 11:53:19,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:19,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 11:53:19,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=9133.333333333334, ans=0.125 2023-09-28 11:53:30,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:41,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 11:53:42,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 11:53:44,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=9200.0, ans=0.025 2023-09-28 11:53:45,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 11:53:45,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:46,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:53:46,836 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.24 vs. limit=10.95 2023-09-28 11:53:49,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:53:50,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 11:53:52,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.52 vs. limit=10.975 2023-09-28 11:53:53,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:53:55,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=9266.666666666666, ans=0.125 2023-09-28 11:54:00,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. 
Duration: 0.69 2023-09-28 11:54:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 11:54:05,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=9266.666666666666, ans=0.5756666666666668 2023-09-28 11:54:09,173 INFO [train.py:1039] (0/4) Epoch 1, batch 1400, loss[loss=0.5171, simple_loss=0.4886, pruned_loss=0.2732, over 24572.00 frames. ], tot_loss[loss=0.5274, simple_loss=0.4746, pruned_loss=0.313, over 4713523.55 frames. ], batch size: 71, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:54:09,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 11:54:11,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:54:16,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:54:16,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:54:17,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=9333.333333333334, ans=0.125 2023-09-28 11:54:22,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 11:54:23,721 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 3.566e+02 5.835e+02 9.354e+02 4.572e+03, threshold=1.167e+03, percent-clipped=13.0 2023-09-28 11:54:23,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. 
Duration: 0.86 2023-09-28 11:54:27,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=9400.0, ans=0.20600000000000002 2023-09-28 11:54:32,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=9400.0, ans=0.125 2023-09-28 11:54:33,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:54:33,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=9400.0, ans=0.125 2023-09-28 11:54:35,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:54:37,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:54:37,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:54:40,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:54:42,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:54:49,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=9466.666666666666, ans=0.125 2023-09-28 11:54:52,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:54:52,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:57,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 11:54:58,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:54:58,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:55:00,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:55:00,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:55:02,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:55:02,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:55:05,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 11:55:05,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:55:10,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 11:55:12,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=9533.333333333334, ans=0.5663333333333334 2023-09-28 11:55:13,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:55:23,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 11:55:25,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:55:25,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:55:28,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:55:30,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:31,861 INFO [train.py:1039] (0/4) Epoch 1, batch 1450, loss[loss=0.4982, simple_loss=0.4538, pruned_loss=0.2776, over 23432.00 frames. ], tot_loss[loss=0.5161, simple_loss=0.4678, pruned_loss=0.3006, over 4721821.23 frames. ], batch size: 119, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:55:31,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:55:35,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:55:36,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:55:36,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:36,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:55:37,090 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 11:55:42,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:44,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:55:44,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:44,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 11:55:46,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:55:48,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 11:55:50,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:50,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:50,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. 
Duration: 0.83 2023-09-28 11:55:52,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:55:54,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:55:56,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:55:56,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:57,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:55:59,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:00,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:04,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:56:04,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:56:07,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:56:07,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:08,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 11:56:08,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:56:10,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:10,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:13,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 11:56:18,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:56:21,176 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 11:56:21,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=9866.666666666666, ans=0.20133333333333334 2023-09-28 11:56:23,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:25,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:56:27,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:29,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 11:56:33,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 11:56:35,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 11:56:36,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 11:56:38,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:41,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:56:41,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:43,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 11:56:45,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 11:56:45,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 11:56:46,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:48,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:56:54,941 INFO [train.py:1039] (0/4) Epoch 1, batch 1500, loss[loss=0.4763, simple_loss=0.4531, pruned_loss=0.2491, over 24642.00 frames. ], tot_loss[loss=0.5066, simple_loss=0.4623, pruned_loss=0.2902, over 4716997.24 frames. 
], batch size: 65, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:56:59,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 11:56:59,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:56:59,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:57:00,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:01,094 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=10000.0, ans=0.125 2023-09-28 11:57:02,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:02,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:57:04,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 11:57:06,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:57:06,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:57:06,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:07,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 11:57:10,818 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.048e+02 3.499e+02 5.909e+02 9.288e+02 2.563e+03, threshold=1.182e+03, percent-clipped=18.0 2023-09-28 11:57:10,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:57:12,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 11:57:18,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:57:18,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:57:20,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:20,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=10066.666666666666, ans=0.125 2023-09-28 11:57:23,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. 
Duration: 0.87 2023-09-28 11:57:25,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=10066.666666666666, ans=0.125 2023-09-28 11:57:26,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 11:57:28,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:28,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 11:57:32,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:57:35,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:57:36,842 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.88 vs. limit=11.3 2023-09-28 11:57:37,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:37,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:57:38,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 11:57:39,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:57:39,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 11:57:41,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 11:57:42,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:47,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:57:47,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 11:57:53,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:57:55,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:57:59,694 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 11:58:01,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:01,200 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 11:58:02,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:04,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:06,029 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. 
Duration: 0.896 2023-09-28 11:58:06,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:58:09,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 11:58:10,246 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=10266.666666666666, ans=0.5406666666666667 2023-09-28 11:58:11,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:16,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:16,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,382 INFO [train.py:1039] (0/4) Epoch 1, batch 1550, loss[loss=0.4772, simple_loss=0.4363, pruned_loss=0.2632, over 23386.00 frames. ], tot_loss[loss=0.4985, simple_loss=0.4581, pruned_loss=0.281, over 4722922.01 frames. ], batch size: 285, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 11:58:18,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:18,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:58:20,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. 
Duration: 0.56 2023-09-28 11:58:21,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 11:58:21,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:58:23,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 11:58:23,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 11:58:25,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:26,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:26,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:58:26,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:58:28,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=10333.333333333334, ans=0.125 2023-09-28 11:58:29,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:29,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:31,394 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. 
Duration: 0.896 2023-09-28 11:58:32,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:32,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:58:32,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:58:36,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:58:36,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 11:58:37,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:37,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 11:58:39,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 11:58:39,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 11:58:39,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:42,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 11:58:42,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=10400.0, ans=0.125 2023-09-28 11:58:47,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:49,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 11:58:49,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 11:58:56,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=10466.666666666666, ans=0.19533333333333333 2023-09-28 11:58:58,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:02,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:59:02,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:59:02,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:59:03,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 11:59:09,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:59:11,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 11:59:15,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:59:15,651 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.13 vs. limit=4.58 2023-09-28 11:59:18,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:59:18,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:20,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 11:59:20,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:20,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:59:21,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:23,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:59:23,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 11:59:25,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:31,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. 
Duration: 0.94 2023-09-28 11:59:33,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=10600.0, ans=0.125 2023-09-28 11:59:36,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:38,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:40,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 11:59:41,350 INFO [train.py:1039] (0/4) Epoch 1, batch 1600, loss[loss=0.5188, simple_loss=0.4682, pruned_loss=0.2901, over 23642.00 frames. ], tot_loss[loss=0.4895, simple_loss=0.453, pruned_loss=0.272, over 4715120.38 frames. ], batch size: 256, lr: 4.45e-02, grad_scale: 16.0 2023-09-28 11:59:42,205 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.40 vs. limit=4.6 2023-09-28 11:59:43,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:44,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:44,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:59:44,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:59:44,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 11:59:49,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:49,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 11:59:51,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 11:59:55,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 11:59:56,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:59:58,016 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.152e+02 3.597e+02 5.871e+02 8.452e+02 2.438e+03, threshold=1.174e+03, percent-clipped=11.0 2023-09-28 11:59:58,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 11:59:58,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:59:58,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=10733.333333333334, ans=0.008536231884057971 2023-09-28 12:00:02,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:00:05,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:00:08,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-09-28 12:00:11,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:00:13,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 12:00:13,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:14,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 12:00:17,312 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.72 vs. limit=11.55 2023-09-28 12:00:19,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 12:00:21,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=10800.0, ans=0.02166666666666667 2023-09-28 12:00:28,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:28,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 12:00:28,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:00:28,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=10800.0, ans=0.192 2023-09-28 12:00:28,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=10800.0, ans=0.192 2023-09-28 12:00:30,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:00:30,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:00:35,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:00:39,352 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=11.575 2023-09-28 12:00:40,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:00:40,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:00:40,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=10866.666666666666, ans=0.5196666666666667 2023-09-28 12:00:41,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:41,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:00:43,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:00:45,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:00:45,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:00:48,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:00:55,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:55,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:57,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 12:00:57,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:00:59,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 12:01:04,498 INFO [train.py:1039] (0/4) Epoch 1, batch 1650, loss[loss=0.4587, simple_loss=0.4486, pruned_loss=0.2316, over 23937.00 frames. ], tot_loss[loss=0.481, simple_loss=0.4483, pruned_loss=0.2636, over 4715971.31 frames. ], batch size: 80, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 12:01:04,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:01:08,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:08,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:01:08,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 12:01:08,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 12:01:08,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 12:01:09,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 12:01:12,299 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:01:14,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:01:16,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:16,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:01:16,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:01:18,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:01:21,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 12:01:22,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:01:22,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:22,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:01:24,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:01:24,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 12:01:25,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 12:01:33,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:01:34,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:01:42,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 12:01:43,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:47,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-09-28 12:01:48,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:01:49,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=11133.333333333334, ans=0.125 2023-09-28 12:01:49,596 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.58 vs. limit=7.783333333333333 2023-09-28 12:01:50,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:01:50,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:01:52,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:01:53,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:55,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:59,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 12:02:01,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:01,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:03,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:02:06,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:06,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 12:02:06,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=11200.0, ans=0.125 2023-09-28 12:02:08,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:08,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 12:02:09,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 12:02:09,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 12:02:11,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:13,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 12:02:13,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:13,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:02:13,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 12:02:18,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:20,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:02:20,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:24,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 12:02:28,769 INFO [train.py:1039] (0/4) Epoch 1, batch 1700, loss[loss=0.3777, simple_loss=0.3536, pruned_loss=0.2016, over 23482.00 frames. ], tot_loss[loss=0.4695, simple_loss=0.4409, pruned_loss=0.254, over 4710174.75 frames. ], batch size: 285, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:02:28,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:28,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:02:29,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. 
Duration: 0.96 2023-09-28 12:02:30,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:30,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:02:30,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:33,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:02:33,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:02:33,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 12:02:37,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:02:45,392 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.253e+02 3.835e+02 6.904e+02 1.046e+03 2.238e+03, threshold=1.381e+03, percent-clipped=16.0 2023-09-28 12:02:45,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:49,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 12:02:55,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=11400.0, ans=0.5010000000000001 2023-09-28 12:02:57,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:02:57,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:02:59,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:59,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:02,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 12:03:05,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:03:05,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:06,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:03:08,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:03:10,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 12:03:10,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-09-28 12:03:10,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=11466.666666666666, ans=0.125 2023-09-28 12:03:10,812 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.04 vs. limit=16.1 2023-09-28 12:03:12,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:13,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 12:03:15,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:03:24,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:24,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:24,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:03:26,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:03:26,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 12:03:27,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 12:03:29,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:29,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 12:03:31,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:03:31,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:31,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:31,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:34,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:34,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:03:36,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:36,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:03:36,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:41,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:03:41,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 12:03:44,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:46,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:47,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 12:03:52,710 INFO [train.py:1039] (0/4) Epoch 1, batch 1750, loss[loss=0.4381, simple_loss=0.4125, pruned_loss=0.2322, over 23806.00 frames. ], tot_loss[loss=0.4599, simple_loss=0.435, pruned_loss=0.2459, over 4719273.22 frames. ], batch size: 164, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:03:56,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:57,376 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.40 vs. limit=16.25 2023-09-28 12:03:58,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:59,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:04:01,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 12:04:01,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:04:04,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:04:05,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:08,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 12:04:11,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:12,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 12:04:14,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:16,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:04:19,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:04:21,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 12:04:22,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:04:22,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 12:04:33,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 12:04:34,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:04:34,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:40,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:40,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:41,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:04:43,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:45,438 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=7.42 vs. limit=8.746666666666666 2023-09-28 12:04:46,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:47,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:47,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 12:04:49,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 12:04:51,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 12:04:53,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:04:54,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:56,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:05:00,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:05:01,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 12:05:03,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:06,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:05:11,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:12,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:14,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=11933.333333333334, ans=0.09899494936611666 2023-09-28 12:05:15,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 12:05:15,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 12:05:15,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:17,320 INFO [train.py:1039] (0/4) Epoch 1, batch 1800, loss[loss=0.4522, simple_loss=0.439, pruned_loss=0.2316, over 23386.00 frames. ], tot_loss[loss=0.4532, simple_loss=0.4308, pruned_loss=0.2402, over 4710070.79 frames. ], batch size: 93, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:05:17,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:05:17,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:17,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:05:17,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:05:17,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:05:20,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:05:22,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:24,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-28 12:05:26,726 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.14 vs. limit=4.8 2023-09-28 12:05:27,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:30,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:05:32,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:05:33,423 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.239e+02 3.495e+02 5.189e+02 7.461e+02 1.869e+03, threshold=1.038e+03, percent-clipped=4.0 2023-09-28 12:05:35,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:36,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:39,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:39,757 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=8.826666666666666 2023-09-28 12:05:41,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:05:42,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 12:05:42,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 12:05:42,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=12066.666666666666, ans=0.17933333333333334 2023-09-28 12:05:44,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:48,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:49,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=12133.333333333334, ans=0.125 2023-09-28 12:05:52,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 12:05:54,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 12:05:54,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 12:05:55,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:55,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:55,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:57,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 12:06:05,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 12:06:07,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:06:08,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:10,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 12:06:10,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 12:06:11,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:06:14,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:06:14,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:06:19,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 12:06:27,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:06:29,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 12:06:29,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:06:29,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:29,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:06:30,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 12:06:32,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=12266.666666666666, ans=0.125 2023-09-28 12:06:34,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:06:34,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:06:37,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 12:06:37,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:38,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=12266.666666666666, ans=0.015555555555555559 2023-09-28 12:06:39,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:40,779 INFO [train.py:1039] (0/4) Epoch 1, batch 1850, loss[loss=0.375, simple_loss=0.3823, pruned_loss=0.1818, over 24322.00 frames. ], tot_loss[loss=0.4468, simple_loss=0.4273, pruned_loss=0.2348, over 4714594.37 frames. 
], batch size: 56, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:06:40,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:06:40,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:42,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:43,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:06:44,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:06:44,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:48,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:06:48,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:06:50,014 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:06:56,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:06:56,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 12:07:00,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. 
Duration: 0.82 2023-09-28 12:07:04,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 12:07:07,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:07,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 12:07:07,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 12:07:17,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:07:19,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 12:07:23,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:07:23,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:24,424 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.09 vs. limit=5.0 2023-09-28 12:07:29,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 12:07:29,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:29,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 12:07:31,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:07:35,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:07:37,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:07:40,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:07:40,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:40,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:07:40,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:43,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:43,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:07:46,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 12:07:46,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:07:50,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=12600.0, ans=0.09899494936611666 2023-09-28 12:07:51,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:07:53,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:07:53,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 12:07:53,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 12:07:55,372 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 12:07:55,494 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 12:07:57,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:07:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:59,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:07:59,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:00,568 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. 
Duration: 0.896 2023-09-28 12:08:00,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:08:01,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:03,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:08:04,762 INFO [train.py:1039] (0/4) Epoch 1, batch 1900, loss[loss=0.4341, simple_loss=0.4173, pruned_loss=0.2251, over 23648.00 frames. ], tot_loss[loss=0.4408, simple_loss=0.4244, pruned_loss=0.2297, over 4722461.53 frames. ], batch size: 256, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:08:04,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:08:06,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:08:06,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 12:08:08,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:08,115 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 12:08:08,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:08:09,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:08:16,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:08:18,003 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 12:08:18,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 12:08:20,937 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.193e+02 3.536e+02 5.623e+02 9.146e+02 3.125e+03, threshold=1.125e+03, percent-clipped=17.0 2023-09-28 12:08:21,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:08:21,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:08:21,241 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 12:08:22,620 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 12:08:29,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 12:08:31,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:08:36,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. 
Duration: 0.65 2023-09-28 12:08:38,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 12:08:43,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=12800.0, ans=0.09899494936611666 2023-09-28 12:08:48,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 12:08:51,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 12:08:51,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:52,839 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 12:08:52,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 12:08:52,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 12:08:54,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 12:08:54,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:08:57,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 12:09:00,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 12:09:05,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:05,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 12:09:08,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:09:10,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=12933.333333333334, ans=0.125 2023-09-28 12:09:11,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=12933.333333333334, ans=0.17066666666666666 2023-09-28 12:09:13,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 12:09:13,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:13,444 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=12933.333333333334, ans=0.125 2023-09-28 12:09:20,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:09:20,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:09:20,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:09:20,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:09:23,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=12933.333333333334, ans=0.125 2023-09-28 12:09:24,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:09:24,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:09:24,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:09:24,826 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:09:27,551 INFO [train.py:1039] (0/4) Epoch 1, batch 1950, loss[loss=0.4073, simple_loss=0.3892, pruned_loss=0.2126, over 23763.00 frames. ], tot_loss[loss=0.4365, simple_loss=0.4223, pruned_loss=0.2261, over 4727011.77 frames. ], batch size: 164, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:09:27,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:27,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:09:30,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:09:30,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:09:30,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:32,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:34,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:37,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:09:37,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:09:37,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=13000.0, ans=0.16999999999999998 2023-09-28 12:09:42,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 12:09:42,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:09:42,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:44,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:09:47,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:09:47,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:09:47,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:50,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:09:53,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:53,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:09:53,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:09:53,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:58,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:59,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=13133.333333333334, ans=0.125 2023-09-28 12:10:00,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:10:00,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:10:01,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:10:01,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 12:10:03,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:10:03,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:10:03,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:08,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:11,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:10:13,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:10:17,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:10:19,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:10:19,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 12:10:20,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:10:21,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=13200.0, ans=0.125 2023-09-28 12:10:25,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:10:25,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:10:26,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:34,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:37,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:38,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:40,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:42,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:10:42,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:43,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 12:10:43,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 12:10:44,466 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.14 vs. limit=17.45 2023-09-28 12:10:45,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:47,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 12:10:50,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=13333.333333333334, ans=0.011111111111111106 2023-09-28 12:10:51,424 INFO [train.py:1039] (0/4) Epoch 1, batch 2000, loss[loss=0.3701, simple_loss=0.3757, pruned_loss=0.1823, over 24475.00 frames. ], tot_loss[loss=0.4335, simple_loss=0.4217, pruned_loss=0.2231, over 4710573.30 frames. ], batch size: 58, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:10:51,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:10:56,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:56,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:10:57,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:57,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:11:00,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:11:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 12:11:05,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:11:06,957 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.094e+02 3.925e+02 5.056e+02 7.202e+02 2.152e+03, threshold=1.011e+03, percent-clipped=10.0 2023-09-28 12:11:08,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:11:09,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=13400.0, ans=0.125 2023-09-28 12:11:10,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 12:11:12,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:11:12,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:11:15,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:11:15,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 12:11:15,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=13400.0, ans=0.43100000000000005 2023-09-28 12:11:16,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:11:18,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:18,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:20,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 12:11:20,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:11:22,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 12:11:22,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:23,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=13466.666666666666, ans=0.125 2023-09-28 12:11:25,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=13466.666666666666, ans=0.125 2023-09-28 12:11:28,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:11:29,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:11:29,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:11:31,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:32,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 12:11:35,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 12:11:35,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:35,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:11:40,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:40,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=13533.333333333334, ans=0.0 2023-09-28 12:11:42,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:11:42,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:44,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:11:46,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 12:11:46,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:47,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:47,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:49,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:52,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:52,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=13533.333333333334, ans=0.010277777777777775 2023-09-28 12:11:53,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 12:11:55,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=13600.0, ans=0.0 2023-09-28 12:12:01,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:12:03,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 12:12:09,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:11,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:11,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:12,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:12:12,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:12:14,463 INFO [train.py:1039] (0/4) Epoch 1, batch 2050, loss[loss=0.4131, simple_loss=0.4266, pruned_loss=0.1998, over 24317.00 frames. ], tot_loss[loss=0.4308, simple_loss=0.4201, pruned_loss=0.2212, over 4703134.67 frames. ], batch size: 77, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:12:14,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:15,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:19,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:19,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:24,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 12:12:25,227 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.75 vs. limit=17.75 2023-09-28 12:12:26,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:12:26,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:27,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:12:31,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 12:12:31,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:12:34,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:12:34,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:12:42,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:42,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:43,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 12:12:47,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 12:12:47,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=13800.0, ans=0.41700000000000004 2023-09-28 12:12:49,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 12:12:50,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:52,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.whiten.whitening_limit, batch_count=13800.0, ans=9.52 2023-09-28 12:12:53,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:55,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:12:55,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:12:56,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:56,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:12:58,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:12:59,140 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.99 vs. 
limit=12.675 2023-09-28 12:12:59,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:13:03,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:05,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:13:07,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:13:09,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:13:12,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:20,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:13:20,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 12:13:25,752 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.18 vs. limit=11.966666666666667 2023-09-28 12:13:26,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:27,627 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=16.11 vs. 
limit=12.725 2023-09-28 12:13:28,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:13:29,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:13:32,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 12:13:35,616 INFO [train.py:1039] (0/4) Epoch 1, batch 2100, loss[loss=0.3621, simple_loss=0.3796, pruned_loss=0.1723, over 24613.00 frames. ], tot_loss[loss=0.4237, simple_loss=0.415, pruned_loss=0.2165, over 4692158.00 frames. ], batch size: 60, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:13:36,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=14000.0, ans=0.125 2023-09-28 12:13:37,968 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 12:13:37,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:38,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:38,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:13:39,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:39,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. 
Duration: 0.55 2023-09-28 12:13:41,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 12:13:42,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=14000.0, ans=0.07 2023-09-28 12:13:43,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:47,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:13:47,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:13:49,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:49,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=14000.0, ans=0.008333333333333338 2023-09-28 12:13:51,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:13:51,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 12:13:52,526 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.170e+02 3.843e+02 5.173e+02 8.078e+02 2.053e+03, threshold=1.035e+03, percent-clipped=17.0 2023-09-28 12:13:52,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:13:52,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-09-28 12:13:52,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 12:13:54,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:13:54,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:13:54,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 12:13:56,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:14:01,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 12:14:01,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:14:04,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:14:04,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:14:08,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:14:08,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 12:14:09,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:14:09,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 12:14:12,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 12:14:12,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:13,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 12:14:13,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 12:14:13,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 12:14:16,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:14:19,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:14:21,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:23,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:24,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:27,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:14:27,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 12:14:27,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:27,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:29,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:29,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 12:14:31,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 12:14:32,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 12:14:37,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:14:39,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:14:39,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 12:14:46,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:48,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 12:14:49,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:14:49,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:14:49,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 12:14:51,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:14:52,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:52,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:14:54,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:14:54,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:54,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=14266.666666666666, ans=0.125 2023-09-28 12:14:56,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 12:14:58,991 INFO [train.py:1039] (0/4) Epoch 1, batch 2150, loss[loss=0.3622, simple_loss=0.3819, pruned_loss=0.1712, over 24653.00 frames. ], tot_loss[loss=0.4182, simple_loss=0.4127, pruned_loss=0.212, over 4705217.36 frames. 
], batch size: 65, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:14:59,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 12:14:59,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:02,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:02,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:15:02,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:15:03,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:15:10,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:15:10,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:11,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:13,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:15:13,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:15:15,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:15:18,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=14400.0, ans=0.125 2023-09-28 12:15:20,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:20,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:15:20,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:15:27,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:27,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 12:15:31,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:31,848 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.12 vs. limit=12.925 2023-09-28 12:15:32,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:15:34,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:15:34,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:35,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:35,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:15:37,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:37,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:15:37,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:15:39,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 12:15:40,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:15:41,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:42,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:42,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:15:44,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:15:47,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:47,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:15:47,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=14533.333333333334, ans=0.125 2023-09-28 12:15:49,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:49,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 12:15:49,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:15:53,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:53,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:56,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:56,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:15:56,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:59,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:15:59,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 12:16:01,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 12:16:01,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:16:01,441 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 12:16:02,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:04,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:16:04,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 12:16:04,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:16:05,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 12:16:05,921 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 12:16:05,921 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 12:16:05,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 12:16:08,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:16:10,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:16:10,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:16:10,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:12,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:16:13,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:13,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:19,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=14600.0, ans=0.125 2023-09-28 12:16:21,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:16:21,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 12:16:21,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=14666.666666666666, ans=0.125 2023-09-28 12:16:22,766 INFO [train.py:1039] (0/4) Epoch 1, batch 2200, loss[loss=0.4083, simple_loss=0.4246, pruned_loss=0.196, over 24653.00 frames. ], tot_loss[loss=0.414, simple_loss=0.4108, pruned_loss=0.2089, over 4710697.20 frames. 
], batch size: 73, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:16:24,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:16:26,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=14666.666666666666, ans=0.035 2023-09-28 12:16:31,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:32,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:16:33,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:16:33,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:16:36,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:37,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:16:37,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 12:16:39,265 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 4.143e+02 6.351e+02 9.037e+02 1.826e+03, threshold=1.270e+03, percent-clipped=17.0 2023-09-28 12:16:41,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-09-28 12:16:44,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:16:50,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 12:16:52,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:54,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:16:56,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:17:00,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:17:00,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 12:17:01,156 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.89 vs. limit=13.05 2023-09-28 12:17:06,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:17:08,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:08,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 12:17:11,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 12:17:13,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:16,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:17:16,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=14866.666666666666, ans=0.025 2023-09-28 12:17:17,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:20,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 12:17:22,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:23,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 12:17:25,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:25,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:17:25,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:27,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:17:29,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:17:29,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:29,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:30,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:17:32,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:17:33,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:17:35,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:17:37,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:17:39,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:17:41,082 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 12:17:44,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:17:44,954 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 12:17:46,347 INFO [train.py:1039] (0/4) Epoch 1, batch 2250, loss[loss=0.3913, simple_loss=0.3855, pruned_loss=0.1985, over 23384.00 frames. 
], tot_loss[loss=0.4105, simple_loss=0.4096, pruned_loss=0.2058, over 4715368.09 frames. ], batch size: 285, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:17:46,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:17:46,505 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 12:17:47,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:48,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:17:49,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:51,300 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 12:17:51,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:17:52,191 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=5.25 2023-09-28 12:17:54,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:00,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 12:18:00,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=15066.666666666666, ans=0.125 2023-09-28 12:18:04,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:18:05,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:07,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:07,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:09,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=15066.666666666666, ans=0.125 2023-09-28 12:18:11,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 12:18:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:11,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:18:14,150 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.80 vs. limit=8.766666666666666 2023-09-28 12:18:14,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. 
Duration: 0.71 2023-09-28 12:18:14,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:18:14,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:18,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:23,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:24,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:18:26,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:18:26,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 12:18:27,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:31,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:18:32,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:34,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:36,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:18:36,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:39,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:39,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:18:45,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:18:47,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:18:52,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:18:52,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:18:53,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:18:59,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:19:02,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:19:02,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 12:19:02,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:19:04,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:19:05,323 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=13.225 2023-09-28 12:19:07,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 12:19:09,227 INFO [train.py:1039] (0/4) Epoch 1, batch 2300, loss[loss=0.3741, simple_loss=0.4034, pruned_loss=0.1723, over 24642.00 frames. ], tot_loss[loss=0.408, simple_loss=0.4086, pruned_loss=0.2038, over 4705939.14 frames. ], batch size: 65, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:19:11,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:19:11,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:12,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=15333.333333333334, ans=0.07 2023-09-28 12:19:16,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:19:20,870 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 12:19:24,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:19:25,389 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.05 vs. limit=5.3 2023-09-28 12:19:27,595 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.211e+02 3.558e+02 5.040e+02 6.600e+02 1.327e+03, threshold=1.008e+03, percent-clipped=3.0 2023-09-28 12:19:30,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:19:30,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:19:31,642 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.34 vs. limit=19.05 2023-09-28 12:19:32,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:19:32,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:32,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 12:19:35,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:19:37,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:19:37,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 12:19:41,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:19:43,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:19:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:19:53,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:19:53,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:57,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:20:00,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:02,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:20:03,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:20:03,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:20:04,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 12:20:07,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-28 12:20:07,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:08,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:08,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:20:08,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:10,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:20:10,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:20:10,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 12:20:10,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:20:10,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:11,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 12:20:14,538 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=15533.333333333334, ans=0.0 2023-09-28 12:20:17,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:20:21,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:20:26,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:26,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:20:28,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:20:31,206 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=15600.0, ans=0.14400000000000002 2023-09-28 12:20:32,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:20:32,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:20:33,937 INFO [train.py:1039] (0/4) Epoch 1, batch 2350, loss[loss=0.4053, simple_loss=0.415, pruned_loss=0.1978, over 23382.00 frames. ], tot_loss[loss=0.4093, simple_loss=0.4097, pruned_loss=0.2046, over 4685791.38 frames. ], batch size: 93, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:20:34,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:20:34,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-09-28 12:20:34,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=15666.666666666666, ans=0.125 2023-09-28 12:20:34,858 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.33 vs. limit=12.833333333333332 2023-09-28 12:20:39,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:20:39,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 12:20:44,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=15666.666666666666, ans=0.3516666666666667 2023-09-28 12:20:45,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 12:20:49,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:54,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:54,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:55,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:20:56,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:20:56,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 12:20:58,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:21:01,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 12:21:06,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:21:09,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:21:09,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:21:12,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:21:12,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 12:21:12,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:21:15,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:21:16,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:17,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:21:21,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:21:23,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 12:21:24,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:21:26,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:21:26,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:21:28,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 12:21:30,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:21:33,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 12:21:33,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:21:36,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 12:21:41,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 12:21:41,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:21:41,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 12:21:43,279 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 12:21:43,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 12:21:44,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 12:21:47,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:21:54,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:21:55,463 INFO [train.py:1039] (0/4) Epoch 1, batch 2400, loss[loss=0.3758, simple_loss=0.4075, pruned_loss=0.172, over 23983.00 frames. ], tot_loss[loss=0.4041, simple_loss=0.4066, pruned_loss=0.2009, over 4681667.20 frames. ], batch size: 86, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:21:59,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:21:59,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:21:59,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=16000.0, ans=0.125 2023-09-28 12:22:01,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-09-28 12:22:01,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 12:22:06,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=16000.0, ans=0.125 2023-09-28 12:22:09,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:22:09,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:09,306 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=16000.0, ans=0.14 2023-09-28 12:22:13,391 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.157e+02 3.788e+02 5.121e+02 7.907e+02 1.984e+03, threshold=1.024e+03, percent-clipped=10.0 2023-09-28 12:22:13,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 12:22:13,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:22:15,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:15,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 12:22:21,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 12:22:21,876 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.55 vs. limit=19.55 2023-09-28 12:22:24,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 12:22:27,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:22:32,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 12:22:37,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:22:38,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:43,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:22:45,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 12:22:45,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:22:52,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:54,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:22:55,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 12:22:57,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:22:57,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:22:57,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:22:57,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:57,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:22:57,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:22:59,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=16200.0, ans=0.3330000000000001 2023-09-28 12:23:02,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:02,649 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:23:03,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:23:03,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 12:23:05,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. 
Duration: 0.96 2023-09-28 12:23:07,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:23:07,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:23:08,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 12:23:09,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 12:23:09,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 12:23:09,029 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 12:23:12,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 12:23:12,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:23:12,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=16266.666666666666, ans=0.13733333333333334 2023-09-28 12:23:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:14,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:15,578 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-28 12:23:17,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:17,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:23:18,596 INFO [train.py:1039] (0/4) Epoch 1, batch 2450, loss[loss=0.4016, simple_loss=0.4127, pruned_loss=0.1952, over 23190.00 frames. ], tot_loss[loss=0.3974, simple_loss=0.4023, pruned_loss=0.1964, over 4681846.78 frames. ], batch size: 93, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:23:21,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:23:21,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:25,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:25,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:23:27,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 12:23:31,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:23:33,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:35,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 12:23:36,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:23:36,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:23:36,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 12:23:42,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:44,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:23:44,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:49,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:23:51,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:51,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:52,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:54,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 12:23:57,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 12:24:01,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=16466.666666666668, ans=0.125 2023-09-28 12:24:04,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:06,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:24:06,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:07,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:24:07,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:09,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:24:09,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 12:24:12,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:24:14,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:24:18,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 12:24:18,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:21,715 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.02 vs. limit=13.7 2023-09-28 12:24:22,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:24:24,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 12:24:24,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:24:26,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:24:26,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 12:24:26,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:24:27,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:24:32,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:24:34,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:24:34,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=16600.0, ans=0.31900000000000006 2023-09-28 12:24:35,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:24:37,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=16600.0, ans=0.31900000000000006 2023-09-28 12:24:39,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 12:24:40,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:24:42,483 INFO [train.py:1039] (0/4) Epoch 1, batch 2500, loss[loss=0.4182, simple_loss=0.4304, pruned_loss=0.203, over 23893.00 frames. ], tot_loss[loss=0.3936, simple_loss=0.4, pruned_loss=0.1937, over 4687522.04 frames. ], batch size: 86, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:24:47,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:24:57,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:24:58,651 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.092e+02 3.311e+02 4.772e+02 6.840e+02 1.468e+03, threshold=9.543e+02, percent-clipped=7.0 2023-09-28 12:24:58,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:25:00,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:25:00,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 12:25:08,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:25:08,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:09,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:25:09,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:25:10,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 12:25:12,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:14,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 12:25:14,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 12:25:16,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:25:16,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=16800.0, ans=0.125 2023-09-28 12:25:20,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:25:22,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:22,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=16800.0, ans=0.125 2023-09-28 12:25:24,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:25:24,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 12:25:26,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:25:29,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:32,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:37,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:25:38,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=16866.666666666668, ans=0.1313333333333333 2023-09-28 12:25:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:45,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:25:45,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=16866.666666666668, ans=0.1313333333333333 2023-09-28 12:25:46,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 12:25:47,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=16866.666666666668, ans=0.0 2023-09-28 12:25:48,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:48,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:25:50,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:25:50,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:25:50,196 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 12:25:50,197 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. 
Duration: 0.768 2023-09-28 12:25:50,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 12:25:54,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:56,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 12:25:56,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 12:25:56,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=16933.333333333332, ans=0.0 2023-09-28 12:25:57,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:59,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 12:26:02,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 12:26:05,231 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:26:06,338 INFO [train.py:1039] (0/4) Epoch 1, batch 2550, loss[loss=0.3612, simple_loss=0.3949, pruned_loss=0.1638, over 24454.00 frames. ], tot_loss[loss=0.391, simple_loss=0.3989, pruned_loss=0.1916, over 4688313.75 frames. ], batch size: 69, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:26:06,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:26:06,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:26:07,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=17000.0, ans=9.25 2023-09-28 12:26:08,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:26:09,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:11,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 12:26:11,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:26:16,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 12:26:18,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:26:20,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:21,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:26:21,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 12:26:23,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-28 12:26:23,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:23,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:23,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=17066.666666666668, ans=0.0 2023-09-28 12:26:27,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:26:27,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 12:26:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:26:27,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:27,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 12:26:28,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=17066.666666666668, ans=0.0 2023-09-28 12:26:41,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:26:47,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:26:47,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:26:47,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:49,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:26:55,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:55,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=17200.0, ans=0.128 2023-09-28 12:26:58,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:58,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:26:58,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:26:59,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:26:59,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:27:02,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:04,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:27:09,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:27:09,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 12:27:09,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:27:09,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:11,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:27:13,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:27:13,880 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.46 vs. limit=5.59 2023-09-28 12:27:14,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:21,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:27:23,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:26,871 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 12:27:30,285 INFO [train.py:1039] (0/4) Epoch 1, batch 2600, loss[loss=0.3644, simple_loss=0.3809, pruned_loss=0.174, over 23567.00 frames. 
], tot_loss[loss=0.3891, simple_loss=0.3982, pruned_loss=0.19, over 4701115.09 frames. ], batch size: 134, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:27:31,798 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 12:27:31,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:27:31,883 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 12:27:33,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 12:27:33,439 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 12:27:36,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:36,551 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 12:27:38,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 12:27:39,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 12:27:41,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:27:41,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=17333.333333333332, ans=0.125 2023-09-28 12:27:44,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. 
Duration: 0.82 2023-09-28 12:27:45,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 12:27:47,477 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.295e+02 3.359e+02 4.665e+02 7.266e+02 2.532e+03, threshold=9.331e+02, percent-clipped=13.0 2023-09-28 12:27:47,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:27:47,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 12:27:51,253 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 12:27:51,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 12:28:01,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:01,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:01,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:01,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 12:28:03,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 12:28:08,369 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=17466.666666666668, ans=0.125 2023-09-28 12:28:09,687 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 12:28:14,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:15,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:15,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 12:28:17,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:17,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:17,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 12:28:22,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:28:22,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:28:25,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:28:27,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=17533.333333333332, ans=0.125 2023-09-28 12:28:29,240 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 12:28:29,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:29,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:28:36,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:36,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:28:36,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 12:28:38,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:28:41,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:28:46,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 12:28:47,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:28:47,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:28:48,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=17600.0, ans=0.125 2023-09-28 12:28:52,392 INFO [train.py:1039] (0/4) Epoch 1, batch 2650, loss[loss=0.3687, simple_loss=0.3945, pruned_loss=0.1715, over 24324.00 frames. ], tot_loss[loss=0.3854, simple_loss=0.3964, pruned_loss=0.1872, over 4702802.78 frames. ], batch size: 61, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:28:53,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 12:28:53,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:54,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:28:54,115 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 12:28:54,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:28:57,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:01,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:29:01,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 12:29:05,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:29:06,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 12:29:06,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:29:06,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:29:09,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 12:29:11,481 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 12:29:14,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:14,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=17733.333333333332, ans=0.0 2023-09-28 12:29:16,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 12:29:16,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:16,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 12:29:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 12:29:20,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:29:22,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:22,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:25,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=17800.0, ans=0.12200000000000003 2023-09-28 12:29:29,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 12:29:29,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 12:29:34,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:29:37,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 12:29:37,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:39,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:39,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:29:41,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:29:41,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:41,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=17866.666666666668, ans=0.0069855072463768115 2023-09-28 12:29:43,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:43,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=17866.666666666668, ans=0.125 2023-09-28 12:29:46,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:47,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:47,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:29:48,010 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=17866.666666666668, ans=0.125 2023-09-28 12:29:49,450 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=17866.666666666668, ans=0.0 2023-09-28 12:29:50,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 12:29:50,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=17866.666666666668, ans=0.0713333333333333 2023-09-28 12:29:51,253 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.01 vs. limit=9.466666666666667 2023-09-28 12:29:52,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:53,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:29:53,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:55,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:55,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:29:59,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:59,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:29:59,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:01,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-09-28 12:30:04,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:04,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:06,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:08,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:08,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=17933.333333333332, ans=0.0 2023-09-28 12:30:10,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:30:10,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:13,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:13,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 12:30:15,498 INFO [train.py:1039] (0/4) Epoch 1, batch 2700, loss[loss=0.3544, simple_loss=0.384, pruned_loss=0.1624, over 24657.00 frames. ], tot_loss[loss=0.3849, simple_loss=0.3974, pruned_loss=0.1863, over 4710019.36 frames. ], batch size: 65, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:30:17,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 12:30:19,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:30:22,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:30:22,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:22,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:23,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:30:23,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:23,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:30:25,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:30:25,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 12:30:25,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:30:25,771 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=18000.0, ans=0.006956521739130435 2023-09-28 12:30:27,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. 
Duration: 0.67275 2023-09-28 12:30:28,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:30:30,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:32,941 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.245e+02 3.486e+02 4.470e+02 6.707e+02 1.380e+03, threshold=8.939e+02, percent-clipped=9.0 2023-09-28 12:30:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:30:36,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 12:30:36,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:30:39,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=18066.666666666668, ans=0.125 2023-09-28 12:30:41,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:30:41,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:30:46,996 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=18133.333333333332, ans=0.1186666666666667 2023-09-28 12:30:48,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 12:30:48,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:49,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:30:49,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:30:50,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=18133.333333333332, ans=0.125 2023-09-28 12:30:52,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:30:55,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:55,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:30:55,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:00,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:00,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:31:10,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 12:31:10,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:31:16,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:31:16,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:18,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=18266.666666666668, ans=0.125 2023-09-28 12:31:22,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:22,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:24,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:31:24,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:26,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:26,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:31:27,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 12:31:31,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:31,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:34,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 12:31:35,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:37,366 INFO [train.py:1039] (0/4) Epoch 1, batch 2750, loss[loss=0.3746, simple_loss=0.4089, pruned_loss=0.1701, over 24028.00 frames. ], tot_loss[loss=0.3831, simple_loss=0.3962, pruned_loss=0.185, over 4705491.17 frames. ], batch size: 80, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:31:37,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:31:37,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 12:31:39,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 12:31:39,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:43,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:31:43,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:31:45,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:45,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:31:45,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:47,849 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.63 vs. limit=11.333333333333332 2023-09-28 12:31:50,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:31:50,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:31:50,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:31:50,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:50,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 12:31:50,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:52,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:31:59,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 12:32:02,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:32:02,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:03,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:03,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:32:05,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:32:07,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:32:07,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:07,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:12,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:32:12,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:32:12,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 12:32:13,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:15,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:32:21,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:24,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:32:24,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:29,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=18533.333333333332, ans=0.125 2023-09-28 12:32:30,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:30,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:32:30,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:32:31,483 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.20 vs. limit=14.45 2023-09-28 12:32:37,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 12:32:37,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:37,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 12:32:43,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:45,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 12:32:50,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:32:53,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:32:53,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 12:32:54,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:32:56,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:32:58,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 12:32:58,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:32:59,459 INFO [train.py:1039] (0/4) Epoch 1, batch 2800, loss[loss=0.3822, simple_loss=0.391, pruned_loss=0.1867, over 23744.00 frames. 
], tot_loss[loss=0.3789, simple_loss=0.3927, pruned_loss=0.1826, over 4701163.56 frames. ], batch size: 164, lr: 4.36e-02, grad_scale: 32.0 2023-09-28 12:33:02,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:33:02,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:02,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:04,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 12:33:04,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:04,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:05,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:05,733 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 12:33:05,734 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 12:33:10,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:11,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 12:33:11,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:33:17,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:33:18,631 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 3.169e+02 4.499e+02 7.440e+02 2.031e+03, threshold=8.997e+02, percent-clipped=14.0 2023-09-28 12:33:18,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 12:33:20,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:33:22,652 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=11.493333333333332 2023-09-28 12:33:23,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 12:33:24,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:24,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:33:24,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:29,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 12:33:31,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:31,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:33:31,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:33:39,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:33:40,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=18800.0, ans=0.125 2023-09-28 12:33:41,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:44,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:44,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:33:46,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:50,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:33:50,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. 
Duration: 0.69 2023-09-28 12:33:52,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:52,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:52,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:33:52,979 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=18866.666666666668, ans=0.006768115942028985 2023-09-28 12:33:58,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:58,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:03,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:34:03,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=18933.333333333332, ans=0.2373333333333334 2023-09-28 12:34:04,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:34:04,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:04,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-28 12:34:05,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:34:07,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:34:09,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:34:09,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 12:34:09,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:10,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:34:10,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:12,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 12:34:14,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:14,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:34:14,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:34:16,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-09-28 12:34:22,209 INFO [train.py:1039] (0/4) Epoch 1, batch 2850, loss[loss=0.4001, simple_loss=0.4228, pruned_loss=0.1887, over 24336.00 frames. ], tot_loss[loss=0.3755, simple_loss=0.3894, pruned_loss=0.1808, over 4669144.66 frames. ], batch size: 74, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:34:22,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:34:22,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:34:22,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:34:24,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:29,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:34:29,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:34:29,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:34:32,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:33,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:34,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 12:34:35,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 12:34:41,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 12:34:41,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:43,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 12:34:43,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:46,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 12:34:47,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 12:34:48,649 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=15.54 vs. limit=14.65 2023-09-28 12:34:49,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:00,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:03,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:03,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 12:35:05,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:35:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:35:06,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:35:08,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:35:09,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 12:35:12,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:35:14,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:14,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:15,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=19200.0, ans=0.125 2023-09-28 12:35:16,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:19,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:19,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:35:20,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:22,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:25,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:35:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:25,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:26,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:35:30,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=19266.666666666668, ans=0.125 2023-09-28 12:35:33,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:35:35,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 12:35:35,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 12:35:36,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:35:36,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:35:38,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 12:35:38,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:35:39,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:39,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:40,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=19266.666666666668, ans=0.125 2023-09-28 12:35:41,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:35:41,366 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 12:35:41,436 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 12:35:41,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:35:41,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:44,382 INFO [train.py:1039] (0/4) Epoch 1, batch 2900, loss[loss=0.3224, simple_loss=0.3526, pruned_loss=0.1461, over 24453.00 frames. ], tot_loss[loss=0.372, simple_loss=0.3881, pruned_loss=0.1779, over 4673084.55 frames. 
], batch size: 58, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:35:46,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:35:48,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:48,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:50,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 12:35:53,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:55,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 12:35:55,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 12:35:55,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=19333.333333333332, ans=0.006666666666666667 2023-09-28 12:35:57,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:35:57,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:36:00,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:36:00,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:36:01,160 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.25 vs. limit=14.775 2023-09-28 12:36:03,246 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 3.297e+02 4.561e+02 6.852e+02 1.887e+03, threshold=9.123e+02, percent-clipped=12.0 2023-09-28 12:36:04,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:36:06,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:36:06,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:36:08,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 12:36:08,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:36:10,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:14,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 12:36:16,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 12:36:19,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:36:19,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 12:36:19,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:36:22,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:36:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:36:26,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:29,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=19466.666666666668, ans=0.21866666666666668 2023-09-28 12:36:31,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:36:32,039 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=19466.666666666668, ans=0.0 2023-09-28 12:36:33,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:36:33,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 12:36:35,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. 
Duration: 0.92 2023-09-28 12:36:35,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:36:39,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:36:43,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 12:36:44,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:36:48,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=19533.333333333332, ans=0.21633333333333338 2023-09-28 12:36:49,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:51,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=19600.0, ans=0.21399999999999997 2023-09-28 12:36:59,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:36:59,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:37:02,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 12:37:04,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:04,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. 
Duration: 0.89 2023-09-28 12:37:04,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:06,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:37:08,479 INFO [train.py:1039] (0/4) Epoch 1, batch 2950, loss[loss=0.3341, simple_loss=0.3581, pruned_loss=0.155, over 22403.00 frames. ], tot_loss[loss=0.3703, simple_loss=0.3881, pruned_loss=0.1763, over 4689800.49 frames. ], batch size: 49, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:37:13,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:14,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 12:37:16,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:16,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:18,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:37:19,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:37:21,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. 
Duration: 0.94 2023-09-28 12:37:21,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=19666.666666666668, ans=0.05 2023-09-28 12:37:22,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 12:37:24,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:37:24,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:26,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=19733.333333333332, ans=0.20933333333333337 2023-09-28 12:37:27,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=19733.333333333332, ans=0.10266666666666668 2023-09-28 12:37:29,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:31,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:34,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:37:34,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:37,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:37:38,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:37:39,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:37:44,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 12:37:48,358 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.09 vs. limit=5.970000000000001 2023-09-28 12:37:49,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 12:37:49,255 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 12:37:50,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:37:52,227 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 12:37:53,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 12:37:53,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:37:55,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:55,685 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 12:37:55,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:37:58,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 12:37:58,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:59,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:38:00,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=19866.666666666668, ans=0.125 2023-09-28 12:38:03,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:03,722 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=19866.666666666668, ans=0.006550724637681159 2023-09-28 12:38:04,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:38:04,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:04,868 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-09-28 12:38:04,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:04,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 12:38:12,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:13,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:15,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 12:38:15,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:38:17,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 12:38:19,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:22,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:38:22,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:38:23,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:23,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-28 12:38:23,854 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:38:25,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:38:26,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:26,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:38:26,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:38:28,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:30,183 INFO [train.py:1039] (0/4) Epoch 1, batch 3000, loss[loss=0.3684, simple_loss=0.3805, pruned_loss=0.1781, over 23887.00 frames. ], tot_loss[loss=0.3713, simple_loss=0.389, pruned_loss=0.1768, over 4697942.93 frames. ], batch size: 195, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:38:30,184 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 12:38:44,448 INFO [train.py:1071] (0/4) Epoch 1, validation: loss=0.4132, simple_loss=0.3632, pruned_loss=0.2317, over 1125622.00 frames. 2023-09-28 12:38:44,449 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 12:38:44,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:38:44,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:38:44,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 12:38:48,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:49,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:38:51,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:38:54,462 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 12:38:55,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 12:38:57,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:59,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:38:59,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 12:38:59,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:02,621 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.319e+02 3.597e+02 4.607e+02 6.753e+02 1.897e+03, threshold=9.214e+02, percent-clipped=10.0 2023-09-28 12:39:07,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 12:39:15,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:39:23,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 12:39:23,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:39:27,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:39:27,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:28,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:39:31,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:31,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 12:39:33,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 12:39:35,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:39:35,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:39:37,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 12:39:38,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:40,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:40,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:39:43,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:39:43,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:43,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:39:45,451 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=31.85 vs. limit=22.5 2023-09-28 12:39:46,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:46,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 12:39:47,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:39:49,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:39:49,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 12:39:54,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:55,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:56,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:39:57,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 12:39:57,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:39:57,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 12:39:59,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:39:59,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 12:40:00,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=20266.666666666668, ans=0.0064637681159420285 2023-09-28 12:40:02,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:03,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:40:03,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. 
Duration: 0.88 2023-09-28 12:40:05,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 12:40:05,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:40:05,859 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=17.77 vs. limit=22.5 2023-09-28 12:40:07,425 INFO [train.py:1039] (0/4) Epoch 1, batch 3050, loss[loss=0.3737, simple_loss=0.3952, pruned_loss=0.1761, over 23252.00 frames. ], tot_loss[loss=0.3705, simple_loss=0.389, pruned_loss=0.1761, over 4696152.19 frames. ], batch size: 105, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:40:07,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:40:09,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:40:09,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:40:09,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:10,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:40:11,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 12:40:12,872 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.17 vs. 
limit=15.0 2023-09-28 12:40:13,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:15,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:15,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:40:20,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:22,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 12:40:31,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 12:40:31,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 12:40:32,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:40:39,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:39,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:41,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 12:40:44,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:40:45,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:45,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:46,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:46,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:47,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:49,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:50,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:52,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 12:40:52,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:52,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:40:57,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 12:40:57,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:40:59,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:40:59,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:03,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=20533.333333333332, ans=0.006405797101449276 2023-09-28 12:41:05,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:41:05,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:12,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:12,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=20600.0, ans=0.2 2023-09-28 12:41:14,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:41:14,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:41:16,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 12:41:17,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:41:17,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:41:19,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 12:41:19,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:19,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:21,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 12:41:24,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:28,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:30,059 INFO [train.py:1039] (0/4) Epoch 1, batch 3100, loss[loss=0.3195, simple_loss=0.3535, pruned_loss=0.1427, over 24452.00 frames. ], tot_loss[loss=0.367, simple_loss=0.3863, pruned_loss=0.1739, over 4700240.56 frames. ], batch size: 63, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:41:32,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:41:35,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-28 12:41:37,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 12:41:40,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 12:41:40,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 12:41:42,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:41:47,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:41:47,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:48,537 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.389e+02 3.317e+02 4.517e+02 6.154e+02 1.203e+03, threshold=9.035e+02, percent-clipped=5.0 2023-09-28 12:41:50,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:41:54,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:54,777 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.84 vs. limit=10.0 2023-09-28 12:41:56,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.20 vs. 
limit=15.0 2023-09-28 12:41:58,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 12:42:03,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:42:04,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:04,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:04,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:04,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:42:07,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:42:07,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 12:42:07,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:42:07,759 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.22 vs. limit=15.0 2023-09-28 12:42:08,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:10,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-09-28 12:42:12,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:42:15,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:42:17,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 12:42:17,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 12:42:18,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:20,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:24,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:24,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:24,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:42:25,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:42:25,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:42:28,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 12:42:28,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:42:28,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:28,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 12:42:31,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:33,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 12:42:33,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=20866.666666666668, ans=0.125 2023-09-28 12:42:36,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:42:36,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 12:42:36,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:37,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:38,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-09-28 12:42:41,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=20933.333333333332, ans=0.2 2023-09-28 12:42:47,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=20933.333333333332, ans=0.125 2023-09-28 12:42:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 12:42:53,231 INFO [train.py:1039] (0/4) Epoch 1, batch 3150, loss[loss=0.3703, simple_loss=0.397, pruned_loss=0.1718, over 24069.00 frames. ], tot_loss[loss=0.3635, simple_loss=0.3835, pruned_loss=0.1717, over 4694012.36 frames. ], batch size: 80, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:42:53,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=21000.0, ans=0.0 2023-09-28 12:42:54,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:42:56,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:57,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:57,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:42:57,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 12:43:00,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 12:43:00,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:43:01,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 12:43:04,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:06,241 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 12:43:08,373 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.64 vs. limit=6.0 2023-09-28 12:43:09,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 12:43:10,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:43:12,197 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 12:43:13,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:43:13,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 12:43:15,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 12:43:15,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. 
Duration: 0.81 2023-09-28 12:43:15,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:15,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:16,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:17,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 12:43:18,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=21066.666666666668, ans=0.1 2023-09-28 12:43:21,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:21,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:23,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:24,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:43:28,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 12:43:28,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:43:31,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 12:43:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:33,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 12:43:34,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 12:43:36,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:43:36,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:43:36,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:43:37,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:37,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:43:38,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=21133.333333333332, ans=0.0 2023-09-28 12:43:39,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:43:39,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:43:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-09-28 12:43:40,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:43:40,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:42,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:43:42,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:42,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 12:43:44,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:45,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 12:43:45,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:47,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 12:43:50,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 12:43:53,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:43:53,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:43:54,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 12:43:56,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:43:56,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:58,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:44:01,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:01,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:44:05,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=21266.666666666668, ans=0.05 2023-09-28 12:44:06,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:44:06,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:09,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:44:16,259 INFO [train.py:1039] (0/4) Epoch 1, batch 3200, loss[loss=0.3807, simple_loss=0.3959, pruned_loss=0.1827, over 23171.00 frames. ], tot_loss[loss=0.3616, simple_loss=0.3809, pruned_loss=0.1712, over 4683911.64 frames. 
], batch size: 105, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:44:16,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:44:16,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:44:17,054 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.01 vs. limit=15.0 2023-09-28 12:44:20,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:23,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:44:23,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 12:44:26,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:44:26,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=21333.333333333332, ans=0.0 2023-09-28 12:44:29,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=21333.333333333332, ans=0.1 2023-09-28 12:44:30,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:44:33,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 12:44:35,010 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.369e+02 3.557e+02 4.560e+02 5.822e+02 1.709e+03, threshold=9.121e+02, percent-clipped=8.0 2023-09-28 12:44:35,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=21400.0, ans=0.125 2023-09-28 12:44:42,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:44:47,657 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.65 vs. limit=15.0 2023-09-28 12:44:50,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 12:44:50,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=21466.666666666668, ans=0.07 2023-09-28 12:44:51,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:44:52,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=21466.666666666668, ans=0.0 2023-09-28 12:44:55,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 12:44:57,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:44:58,422 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.21 vs. 
limit=15.0 2023-09-28 12:45:00,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:45:00,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:45:02,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:45:05,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 12:45:08,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:45:10,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 12:45:15,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 12:45:16,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:45:22,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:22,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:45:22,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:24,404 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. 
Duration: 0.896 2023-09-28 12:45:24,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:45:29,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:45:32,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 12:45:32,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 12:45:34,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 12:45:36,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 12:45:38,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:45:39,872 INFO [train.py:1039] (0/4) Epoch 1, batch 3250, loss[loss=0.3186, simple_loss=0.351, pruned_loss=0.1431, over 24335.00 frames. ], tot_loss[loss=0.3592, simple_loss=0.38, pruned_loss=0.1692, over 4687886.43 frames. ], batch size: 56, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:45:40,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:45:40,150 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 12:45:41,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:45:41,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:45:41,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 12:45:43,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=21666.666666666668, ans=0.006159420289855073 2023-09-28 12:45:43,993 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=15.0 2023-09-28 12:45:46,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:45:49,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:45:59,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:45:59,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 12:45:59,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:00,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:00,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:00,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 12:46:02,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:46:04,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:04,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:46:06,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:06,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:11,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:13,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:14,253 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.56 vs. limit=15.0 2023-09-28 12:46:14,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:46:14,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:16,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:16,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:16,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=21800.0, ans=0.125 2023-09-28 12:46:17,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:22,628 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=19.97 vs. limit=22.5 2023-09-28 12:46:23,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 12:46:24,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:46:24,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:46:26,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:26,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 12:46:32,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:46:33,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=21866.666666666668, ans=0.1 2023-09-28 12:46:33,425 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.49 vs. limit=22.5 2023-09-28 12:46:40,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:46:41,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:41,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 12:46:41,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:46:41,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:46:41,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=21866.666666666668, ans=0.1 2023-09-28 12:46:42,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:44,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. 
Duration: 0.9 2023-09-28 12:46:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 12:46:45,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:47,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:48,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=21933.333333333332, ans=0.0 2023-09-28 12:46:49,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:49,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:46:49,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:52,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:52,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:54,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 12:46:54,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:46:57,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 12:46:57,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 12:47:00,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:47:00,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 12:47:02,082 INFO [train.py:1039] (0/4) Epoch 1, batch 3300, loss[loss=0.3164, simple_loss=0.3491, pruned_loss=0.1419, over 22128.00 frames. ], tot_loss[loss=0.361, simple_loss=0.3822, pruned_loss=0.1699, over 4695778.55 frames. ], batch size: 48, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:47:02,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 12:47:03,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 12:47:03,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:06,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:47:09,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:47:09,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:12,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-28 12:47:12,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:47:14,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:16,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:47:20,468 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.229e+02 3.284e+02 5.093e+02 6.809e+02 1.583e+03, threshold=1.019e+03, percent-clipped=11.0 2023-09-28 12:47:24,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 12:47:24,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=22066.666666666668, ans=0.2 2023-09-28 12:47:25,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:25,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:25,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=22066.666666666668, ans=0.0 2023-09-28 12:47:27,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:27,292 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 12:47:27,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:47:28,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:47:30,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:47:30,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:47:30,408 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 12:47:35,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:35,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:47:37,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:37,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 12:47:38,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 12:47:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:40,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:47:41,655 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. 
Duration: 0.768 2023-09-28 12:47:43,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 12:47:45,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:47:46,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 12:47:48,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:47:49,002 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.89 vs. limit=15.0 2023-09-28 12:47:51,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:47:51,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:47:51,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=22200.0, ans=0.125 2023-09-28 12:47:55,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:55,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:55,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:55,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-09-28 12:47:57,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:47:58,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:58,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:48:00,236 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 12:48:01,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 12:48:04,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:48:04,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:04,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:04,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=22200.0, ans=0.2 2023-09-28 12:48:07,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:48:07,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 12:48:08,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:08,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:48:10,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:48:11,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:48:15,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 12:48:15,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:16,317 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.61 vs. limit=22.5 2023-09-28 12:48:16,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:19,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:48:20,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:48:21,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 12:48:21,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=22266.666666666668, ans=0.125 2023-09-28 12:48:23,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:23,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:24,663 INFO [train.py:1039] (0/4) Epoch 1, batch 3350, loss[loss=0.3959, simple_loss=0.3992, pruned_loss=0.1964, over 23887.00 frames. ], tot_loss[loss=0.3618, simple_loss=0.3834, pruned_loss=0.1701, over 4703861.07 frames. ], batch size: 195, lr: 4.30e-02, grad_scale: 16.0 2023-09-28 12:48:24,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:48:26,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:29,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:48:30,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:33,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:48:35,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:36,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 12:48:39,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 12:48:39,317 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 12:48:40,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:43,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 12:48:43,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 12:48:46,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:48:46,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:48:48,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:48,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 12:48:48,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:48,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:48:52,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:48:52,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=22400.0, ans=0.006 2023-09-28 12:48:55,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:55,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:48:59,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:03,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:03,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:05,623 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.03 vs. limit=6.0 2023-09-28 12:49:08,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:49:08,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 12:49:08,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=22466.666666666668, ans=0.125 2023-09-28 12:49:10,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=22466.666666666668, ans=0.125 2023-09-28 12:49:11,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:11,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:13,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:16,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 12:49:16,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:49:17,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 12:49:17,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:49:18,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 12:49:20,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:21,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:49:29,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:31,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 12:49:32,103 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.99 vs. limit=15.0 2023-09-28 12:49:32,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:49:32,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:49:34,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=22600.0, ans=0.005956521739130435 2023-09-28 12:49:35,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:49:41,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:49:44,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 12:49:44,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:49:44,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 12:49:45,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=22600.0, ans=0.125 2023-09-28 12:49:48,229 INFO [train.py:1039] (0/4) Epoch 1, batch 3400, loss[loss=0.3376, simple_loss=0.3531, pruned_loss=0.161, over 23750.00 frames. ], tot_loss[loss=0.3618, simple_loss=0.3842, pruned_loss=0.1697, over 4705880.90 frames. ], batch size: 149, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:49:48,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:48,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 12:49:48,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:50,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 12:49:52,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:52,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:53,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:49:53,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:49:54,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. 
Duration: 0.69 2023-09-28 12:49:56,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=22666.666666666668, ans=0.005942028985507246 2023-09-28 12:49:59,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 12:49:59,531 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 12:49:59,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:03,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:50:03,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:50:04,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:06,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:50:06,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=22733.333333333332, ans=0.1 2023-09-28 12:50:07,749 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.237e+02 3.122e+02 3.897e+02 5.653e+02 2.230e+03, threshold=7.795e+02, percent-clipped=8.0 2023-09-28 12:50:08,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=22733.333333333332, ans=0.0 2023-09-28 12:50:11,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:50:12,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 12:50:17,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:50:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:19,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:21,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:50:28,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:50:29,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=22800.0, ans=0.125 2023-09-28 12:50:34,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 12:50:40,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 12:50:41,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:50:42,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:42,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:43,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:50:48,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:52,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:50:52,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:50:59,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:00,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 12:51:05,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:51:09,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 12:51:10,443 INFO [train.py:1039] (0/4) Epoch 1, batch 3450, loss[loss=0.3783, simple_loss=0.4013, pruned_loss=0.1776, over 23504.00 frames. ], tot_loss[loss=0.3599, simple_loss=0.3834, pruned_loss=0.1682, over 4717031.30 frames. 
], batch size: 93, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:51:12,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 12:51:14,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:16,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:51:16,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 12:51:18,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:22,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:51:26,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:51:28,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:29,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:51:29,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:32,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:51:38,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 12:51:44,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 12:51:44,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:51:44,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:51:46,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:50,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 12:51:51,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:51:51,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=23133.333333333332, ans=0.125 2023-09-28 12:51:55,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=23133.333333333332, ans=0.005840579710144928 2023-09-28 12:51:56,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:51:56,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 12:51:56,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=23133.333333333332, ans=0.125 2023-09-28 12:51:58,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:51:58,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=23200.0, ans=0.125 2023-09-28 12:51:59,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:52:03,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 12:52:03,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:04,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:52:08,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:10,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 12:52:11,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:52:17,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 12:52:17,973 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=30.27 vs. limit=15.0 2023-09-28 12:52:18,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=23266.666666666668, ans=10.0 2023-09-28 12:52:19,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:23,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:27,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:27,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:52:29,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:52:29,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:32,290 INFO [train.py:1039] (0/4) Epoch 1, batch 3500, loss[loss=0.3441, simple_loss=0.3798, pruned_loss=0.1542, over 24632.00 frames. ], tot_loss[loss=0.358, simple_loss=0.3814, pruned_loss=0.1673, over 4718983.72 frames. ], batch size: 68, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:52:34,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:38,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 12:52:38,777 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.53 vs. limit=15.0 2023-09-28 12:52:39,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 12:52:41,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:52:41,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=23333.333333333332, ans=0.125 2023-09-28 12:52:45,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 12:52:48,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:48,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 12:52:51,801 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.155e+02 3.379e+02 4.182e+02 5.188e+02 1.059e+03, threshold=8.364e+02, percent-clipped=3.0 2023-09-28 12:52:55,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:52:55,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:57,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 12:52:57,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:52:57,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:52:59,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:59,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:52:59,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 12:53:00,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:02,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:53:03,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:07,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:08,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 12:53:08,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 12:53:10,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=23466.666666666668, ans=0.1 2023-09-28 12:53:11,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:14,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:53:15,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:17,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:53:17,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:20,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 12:53:20,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 12:53:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-09-28 12:53:20,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=23533.333333333332, ans=0.005753623188405798 2023-09-28 12:53:21,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=23533.333333333332, ans=0.125 2023-09-28 12:53:22,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:23,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:25,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:25,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:53:27,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=23533.333333333332, ans=0.125 2023-09-28 12:53:29,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:53:30,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:53:33,277 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.61 vs. limit=15.0 2023-09-28 12:53:35,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:53:36,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=23533.333333333332, ans=0.0 2023-09-28 12:53:37,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 12:53:37,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 12:53:37,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:53:40,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:40,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:41,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:45,344 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.86 vs. limit=15.0 2023-09-28 12:53:46,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 12:53:46,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:47,204 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.44 vs. 
limit=15.0 2023-09-28 12:53:48,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:50,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 12:53:51,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 12:53:53,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:55,348 INFO [train.py:1039] (0/4) Epoch 1, batch 3550, loss[loss=0.3238, simple_loss=0.3515, pruned_loss=0.1481, over 24594.00 frames. ], tot_loss[loss=0.3555, simple_loss=0.3801, pruned_loss=0.1655, over 4727484.32 frames. ], batch size: 60, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:53:55,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:55,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:53:57,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:53:57,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=23666.666666666668, ans=0.125 2023-09-28 12:54:00,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 12:54:06,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=23666.666666666668, ans=0.125 2023-09-28 12:54:10,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:12,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:54:15,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:16,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:54:18,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:18,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:54:18,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:54:20,586 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.48 vs. limit=15.0 2023-09-28 12:54:23,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:23,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 12:54:23,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:23,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:54:24,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:54:25,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=23733.333333333332, ans=0.125 2023-09-28 12:54:29,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=23800.0, ans=0.125 2023-09-28 12:54:33,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:54:33,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:34,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:34,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:36,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:54:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. 
Duration: 0.38 2023-09-28 12:54:36,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:54:38,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=23800.0, ans=0.0 2023-09-28 12:54:43,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:44,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:46,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:47,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 12:54:49,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:54:50,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 12:54:50,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:53,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 12:54:53,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:54:58,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 12:55:00,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:07,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:07,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 12:55:08,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:13,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:55:14,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 12:55:17,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=24000.0, ans=0.125 2023-09-28 12:55:17,949 INFO [train.py:1039] (0/4) Epoch 1, batch 3600, loss[loss=0.3146, simple_loss=0.3459, pruned_loss=0.1417, over 24342.00 frames. ], tot_loss[loss=0.3535, simple_loss=0.3781, pruned_loss=0.1645, over 4727140.65 frames. 
], batch size: 56, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:55:18,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=24000.0, ans=0.125 2023-09-28 12:55:21,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 12:55:21,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:55:22,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:55:25,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:26,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:27,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:55:30,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:32,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:33,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:55:36,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 12:55:37,403 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.926e+02 4.483e+02 7.377e+02 1.636e+03, threshold=8.966e+02, percent-clipped=15.0 2023-09-28 12:55:37,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:37,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 12:55:40,565 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.50 vs. limit=15.0 2023-09-28 12:55:41,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:55:42,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:45,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:48,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=24066.666666666668, ans=0.125 2023-09-28 12:55:49,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:50,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:55:51,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 12:55:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 12:55:51,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:54,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:54,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=24133.333333333332, ans=0.125 2023-09-28 12:55:56,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:55:56,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:57,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:59,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:55:59,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 12:56:02,732 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=24133.333333333332, ans=0.0 2023-09-28 12:56:06,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:07,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-28 12:56:09,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 12:56:12,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:56:14,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=24200.0, ans=0.07 2023-09-28 12:56:17,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:21,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:27,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:56:27,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:56:27,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 12:56:29,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 12:56:31,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 12:56:34,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:56:35,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 12:56:36,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 12:56:36,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:56:36,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:56:36,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:38,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 12:56:39,579 INFO [train.py:1039] (0/4) Epoch 1, batch 3650, loss[loss=0.3287, simple_loss=0.378, pruned_loss=0.1397, over 24677.00 frames. ], tot_loss[loss=0.3532, simple_loss=0.3784, pruned_loss=0.1641, over 4723877.84 frames. ], batch size: 73, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:56:39,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 12:56:43,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:43,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 12:56:50,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 12:56:53,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 12:56:57,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 12:56:59,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 12:57:03,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:03,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:57:03,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:57:06,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:57:06,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:57:08,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 12:57:09,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:57:09,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:09,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-09-28 12:57:10,070 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=24400.0, ans=0.2 2023-09-28 12:57:11,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:57:12,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:12,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:14,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:57:18,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 12:57:19,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 12:57:19,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:57:22,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 12:57:24,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:24,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 12:57:30,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=24533.333333333332, ans=0.1 2023-09-28 12:57:33,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:57:35,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:35,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:57:35,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:57:35,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:57:38,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:57:40,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:40,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=24533.333333333332, ans=0.125 2023-09-28 12:57:41,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:41,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 12:57:43,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:57:43,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=24533.333333333332, ans=0.005536231884057972 2023-09-28 12:57:46,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:46,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:52,455 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 12:57:56,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:56,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:58,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:57:58,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:57:59,882 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.30 vs. limit=6.0 2023-09-28 12:58:00,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:58:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 12:58:03,220 INFO [train.py:1039] (0/4) Epoch 1, batch 3700, loss[loss=0.4959, simple_loss=0.4607, pruned_loss=0.2656, over 19509.00 frames. ], tot_loss[loss=0.3535, simple_loss=0.379, pruned_loss=0.164, over 4735225.33 frames. ], batch size: 390, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:58:04,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 12:58:04,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:07,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:58:10,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:58:10,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:58:11,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:11,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 12:58:13,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:14,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:58:14,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 12:58:18,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:58:22,285 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.123e+02 3.422e+02 4.027e+02 5.760e+02 1.496e+03, threshold=8.053e+02, percent-clipped=7.0 2023-09-28 12:58:22,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:58:23,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:25,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:58:25,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:25,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:58:27,255 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=24733.333333333332, ans=0.125 2023-09-28 12:58:28,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:28,569 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 12:58:39,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:58:40,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-28 12:58:41,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:58:41,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 12:58:41,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:44,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:46,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 12:58:47,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:49,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:58:50,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:52,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:58:53,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:58:55,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=24866.666666666668, ans=0.125 2023-09-28 12:58:58,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 12:58:58,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 12:58:58,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:58,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 12:59:03,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:59:03,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:59:06,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:08,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 12:59:09,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:59:09,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:59:09,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:11,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:13,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 12:59:15,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 12:59:16,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 12:59:16,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:59:18,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:19,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:59:19,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:59:22,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:59:24,357 INFO [train.py:1039] (0/4) Epoch 1, batch 3750, loss[loss=0.378, simple_loss=0.3874, pruned_loss=0.1844, over 23457.00 frames. ], tot_loss[loss=0.3553, simple_loss=0.3808, pruned_loss=0.1649, over 4723301.28 frames. ], batch size: 285, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:59:24,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:59:24,833 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=25000.0, ans=0.125 2023-09-28 12:59:25,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 12:59:27,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 12:59:29,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 12:59:32,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:59:32,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 12:59:33,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:59:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:37,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:40,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:59:41,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:45,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:59:46,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:59:49,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 12:59:51,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:59:53,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 12:59:53,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:59:54,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:59:54,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:59,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 13:00:03,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 13:00:03,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:00:03,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:00:05,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:13,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:13,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. 
Duration: 0.42725 2023-09-28 13:00:16,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=25200.0, ans=0.005391304347826087 2023-09-28 13:00:18,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 13:00:21,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:25,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:00:25,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:00:29,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=25266.666666666668, ans=10.0 2023-09-28 13:00:30,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:00:33,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=25266.666666666668, ans=0.125 2023-09-28 13:00:33,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=25266.666666666668, ans=0.0 2023-09-28 13:00:34,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 13:00:34,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 13:00:36,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:00:38,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:00:39,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:00:46,952 INFO [train.py:1039] (0/4) Epoch 1, batch 3800, loss[loss=0.3282, simple_loss=0.371, pruned_loss=0.1427, over 24458.00 frames. ], tot_loss[loss=0.3538, simple_loss=0.3799, pruned_loss=0.1639, over 4721742.95 frames. ], batch size: 63, lr: 4.25e-02, grad_scale: 32.0 2023-09-28 13:00:49,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:00:52,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:52,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=25333.333333333332, ans=0.125 2023-09-28 13:00:54,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:00:55,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 13:00:55,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:57,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:01:00,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:01:00,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=25333.333333333332, ans=0.07 2023-09-28 13:01:00,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=25333.333333333332, ans=0.1 2023-09-28 13:01:01,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:01:01,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:01:03,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:01:05,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:01:06,434 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.982e+02 3.306e+02 4.581e+02 6.587e+02 1.016e+03, threshold=9.163e+02, percent-clipped=14.0 2023-09-28 13:01:06,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:06,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-09-28 13:01:09,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=25400.0, ans=0.1 2023-09-28 13:01:11,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:01:12,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:01:14,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:16,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=25400.0, ans=0.0 2023-09-28 13:01:17,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:01:18,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:01:19,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:01:19,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:23,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:25,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:29,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-28 13:01:30,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=25466.666666666668, ans=0.125 2023-09-28 13:01:31,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 13:01:33,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:33,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=25466.666666666668, ans=0.125 2023-09-28 13:01:39,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:01:43,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:01:45,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 13:01:49,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 13:01:49,784 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=27.85 vs. limit=22.5 2023-09-28 13:01:50,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:52,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:01:52,790 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.59 vs. limit=15.0 2023-09-28 13:01:54,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:55,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 13:02:00,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 13:02:00,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 13:02:00,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:01,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:02:06,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:02:07,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:02:09,328 INFO [train.py:1039] (0/4) Epoch 1, batch 3850, loss[loss=0.3765, simple_loss=0.3829, pruned_loss=0.185, over 23766.00 frames. ], tot_loss[loss=0.351, simple_loss=0.3769, pruned_loss=0.1625, over 4722501.05 frames. ], batch size: 179, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:02:12,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 13:02:14,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 13:02:14,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:02:14,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:18,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:02:18,702 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=25666.666666666668, ans=0.2 2023-09-28 13:02:20,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:23,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:02:23,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 13:02:33,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:33,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:36,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:36,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 13:02:39,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:41,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:02:43,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:43,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:02:44,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:46,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:47,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:47,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:02:49,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 13:02:49,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 13:02:49,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:50,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 13:02:54,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:02:55,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:55,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 13:02:59,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 13:02:59,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:01,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 13:03:04,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:03:08,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=25866.666666666668, ans=0.125 2023-09-28 13:03:11,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:11,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:03:16,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:16,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. 
Duration: 0.92 2023-09-28 13:03:17,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 13:03:20,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:20,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:22,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=25933.333333333332, ans=0.0 2023-09-28 13:03:23,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:03:23,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:03:25,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:25,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:25,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:03:27,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 13:03:27,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:03:29,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. 
Duration: 0.89 2023-09-28 13:03:29,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:29,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:30,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:03:31,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=26000.0, ans=0.0052173913043478265 2023-09-28 13:03:32,687 INFO [train.py:1039] (0/4) Epoch 1, batch 3900, loss[loss=0.3728, simple_loss=0.3992, pruned_loss=0.1732, over 23346.00 frames. ], tot_loss[loss=0.3488, simple_loss=0.3756, pruned_loss=0.161, over 4720842.36 frames. ], batch size: 93, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:03:32,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:32,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:03:32,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:32,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:34,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:34,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-09-28 13:03:34,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:38,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:38,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:38,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:03:41,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:44,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:44,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:46,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:03:47,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 13:03:47,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:03:49,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 13:03:50,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:03:51,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 13:03:52,279 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.956e+02 3.706e+02 4.742e+02 9.282e+02, threshold=7.412e+02, percent-clipped=1.0 2023-09-28 13:03:52,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 13:03:57,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:03:57,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:57,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:03:57,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:02,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:04:07,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:04:09,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:04:10,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:11,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 13:04:15,355 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.27 vs. limit=15.0 2023-09-28 13:04:19,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:04:20,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:04:28,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:04:28,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:04:37,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=26266.666666666668, ans=0.125 2023-09-28 13:04:40,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:04:42,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:42,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 13:04:44,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 13:04:44,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 13:04:45,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 13:04:48,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=26266.666666666668, ans=0.125 2023-09-28 13:04:49,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:49,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 13:04:55,902 INFO [train.py:1039] (0/4) Epoch 1, batch 3950, loss[loss=0.3153, simple_loss=0.3583, pruned_loss=0.1362, over 24320.00 frames. ], tot_loss[loss=0.3492, simple_loss=0.3755, pruned_loss=0.1614, over 4718925.89 frames. ], batch size: 61, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:04:57,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:05:00,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 13:05:00,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:05:02,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:05:03,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:05:09,887 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. 
Duration: 0.896 2023-09-28 13:05:09,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:10,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 13:05:12,023 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 13:05:12,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:05:16,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:16,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:05:16,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:19,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 13:05:21,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:05:22,779 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.02 vs. limit=15.0 2023-09-28 13:05:23,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:23,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 13:05:24,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:05:25,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:05:25,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=26400.0, ans=0.125 2023-09-28 13:05:28,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=26466.666666666668, ans=0.125 2023-09-28 13:05:30,226 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=26466.666666666668, ans=0.125 2023-09-28 13:05:37,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:05:37,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:05:42,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 13:05:44,837 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.29 vs. limit=15.0 2023-09-28 13:05:47,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 13:05:47,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. 
Duration: 0.85 2023-09-28 13:05:47,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:05:48,878 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.65 vs. limit=22.5 2023-09-28 13:05:49,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:05:55,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=26533.333333333332, ans=0.0 2023-09-28 13:05:58,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:06:00,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:06:00,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:00,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:06:01,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 13:06:06,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:06:06,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 13:06:08,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=26600.0, ans=0.04949747468305833 2023-09-28 13:06:11,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 13:06:11,570 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:06:16,356 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-4000.pt 2023-09-28 13:06:20,838 INFO [train.py:1039] (0/4) Epoch 1, batch 4000, loss[loss=0.3675, simple_loss=0.3791, pruned_loss=0.1779, over 23739.00 frames. ], tot_loss[loss=0.3484, simple_loss=0.375, pruned_loss=0.1609, over 4715053.78 frames. ], batch size: 232, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:06:26,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:31,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:38,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:38,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:06:40,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:40,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. 
Duration: 0.93 2023-09-28 13:06:40,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:06:41,653 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.100e+02 3.230e+02 4.293e+02 5.644e+02 1.126e+03, threshold=8.585e+02, percent-clipped=11.0 2023-09-28 13:06:41,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 13:06:41,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:06:41,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 13:06:43,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:47,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:06:47,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:06:47,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:06:47,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:47,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:06:49,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 13:06:49,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=26733.333333333332, ans=0.125 2023-09-28 13:06:50,939 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 13:06:52,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:06:52,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:06:55,656 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 13:06:57,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:06:57,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:05,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 13:07:05,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:07:05,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=26800.0, ans=0.125 2023-09-28 13:07:07,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:07:09,134 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. 
Duration: 0.768 2023-09-28 13:07:09,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:07:10,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 13:07:10,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:07:10,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:12,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:07:14,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:07:15,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:07:15,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:17,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 13:07:18,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:20,039 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. 
Duration: 0.896 2023-09-28 13:07:23,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=26866.666666666668, ans=0.125 2023-09-28 13:07:24,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:07:28,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 13:07:29,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:07:30,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=26933.333333333332, ans=0.0 2023-09-28 13:07:30,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:31,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:33,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:07:33,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:07:38,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:39,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:41,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. 
Duration: 0.588875 2023-09-28 13:07:41,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 13:07:43,730 INFO [train.py:1039] (0/4) Epoch 1, batch 4050, loss[loss=0.311, simple_loss=0.345, pruned_loss=0.1385, over 24434.00 frames. ], tot_loss[loss=0.35, simple_loss=0.3757, pruned_loss=0.1621, over 4699941.15 frames. ], batch size: 58, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:07:45,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:07:45,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:07:47,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:07:48,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:07:48,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=27000.0, ans=0.125 2023-09-28 13:07:49,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:52,580 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.50 vs. limit=6.0 2023-09-28 13:07:53,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:54,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 13:07:55,477 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.56 vs. limit=15.0 2023-09-28 13:07:56,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:07:57,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:07:59,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:08:02,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:04,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:08:07,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 13:08:09,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 13:08:09,162 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 13:08:12,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:08:21,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-09-28 13:08:21,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=27133.333333333332, ans=0.0 2023-09-28 13:08:22,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:26,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:29,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:29,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:08:29,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:32,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:08:35,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 13:08:35,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:08:37,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:39,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. 
Duration: 0.83 2023-09-28 13:08:40,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=27200.0, ans=0.0 2023-09-28 13:08:43,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:44,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=27200.0, ans=0.004956521739130435 2023-09-28 13:08:46,034 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.75 vs. limit=22.5 2023-09-28 13:08:53,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 13:08:53,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:53,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:08:56,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 13:08:56,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 13:08:56,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:08:56,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=27266.666666666668, ans=0.2 2023-09-28 13:08:57,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 13:08:59,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:08:59,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:09:05,398 INFO [train.py:1039] (0/4) Epoch 1, batch 4100, loss[loss=0.3568, simple_loss=0.3657, pruned_loss=0.174, over 22758.00 frames. ], tot_loss[loss=0.3489, simple_loss=0.3753, pruned_loss=0.1613, over 4716257.62 frames. ], batch size: 322, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:09:08,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 13:09:10,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 13:09:11,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=27333.333333333332, ans=0.125 2023-09-28 13:09:14,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 13:09:14,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 13:09:14,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:16,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:16,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 13:09:18,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:09:18,241 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 13:09:19,103 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.65 vs. limit=15.0 2023-09-28 13:09:22,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:24,289 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.870e+02 3.486e+02 4.089e+02 6.314e+02, threshold=6.972e+02, percent-clipped=0.0 2023-09-28 13:09:24,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:09:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:24,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=27400.0, ans=0.125 2023-09-28 13:09:25,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:09:32,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:09:33,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:33,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:09:33,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 13:09:35,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:35,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:09:35,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:35,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:09:36,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 13:09:38,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:09:38,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 13:09:38,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=27466.666666666668, ans=0.0 2023-09-28 13:09:40,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:09:43,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:43,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. 
Duration: 0.79 2023-09-28 13:09:46,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:09:46,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:09:47,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:09:49,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 13:09:49,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:09:50,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:09:55,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 13:09:56,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:57,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:09:59,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=27533.333333333332, ans=0.1 2023-09-28 13:10:01,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:08,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:10:11,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:12,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:10:17,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:17,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:20,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:22,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=27600.0, ans=0.125 2023-09-28 13:10:24,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:10:24,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=17.84 vs. limit=15.0 2023-09-28 13:10:27,471 INFO [train.py:1039] (0/4) Epoch 1, batch 4150, loss[loss=0.3404, simple_loss=0.3764, pruned_loss=0.1523, over 24500.00 frames. ], tot_loss[loss=0.3488, simple_loss=0.3755, pruned_loss=0.161, over 4703127.34 frames. ], batch size: 66, lr: 4.21e-02, grad_scale: 32.0 2023-09-28 13:10:29,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:10:30,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 13:10:32,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:10:32,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:35,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 13:10:35,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:35,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 13:10:37,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 13:10:37,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 13:10:38,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:44,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:10:44,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:47,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:10:47,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:10:47,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=27733.333333333332, ans=0.1 2023-09-28 13:10:49,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:10:50,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:10:51,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:53,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:10:56,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=27733.333333333332, ans=0.125 2023-09-28 13:10:58,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:03,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:03,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 13:11:06,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 13:11:06,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 13:11:07,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 13:11:07,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:11:07,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:12,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:13,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:16,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 13:11:19,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:22,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:11:22,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 13:11:24,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:25,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 13:11:27,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 13:11:28,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:30,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:30,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 13:11:30,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:11:30,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:11:32,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:11:35,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 13:11:35,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:37,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:11:37,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:11:38,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 13:11:40,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 13:11:40,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:11:40,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:43,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:43,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 13:11:43,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:49,440 INFO [train.py:1039] (0/4) Epoch 1, batch 4200, loss[loss=0.3558, simple_loss=0.3769, pruned_loss=0.1674, over 23449.00 frames. ], tot_loss[loss=0.3472, simple_loss=0.3741, pruned_loss=0.1602, over 4695623.04 frames. ], batch size: 119, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:11:49,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:11:51,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 13:11:52,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:11:54,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:11:54,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 13:11:56,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:56,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:58,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 13:12:00,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=28000.0, ans=0.0 2023-09-28 13:12:01,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 13:12:02,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:02,895 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.93 vs. limit=12.0 2023-09-28 13:12:05,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:08,506 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.106e+02 3.443e+02 4.074e+02 5.504e+02 1.074e+03, threshold=8.148e+02, percent-clipped=10.0 2023-09-28 13:12:08,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:12:08,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=28066.666666666668, ans=0.0 2023-09-28 13:12:13,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:12:15,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:15,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:16,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 13:12:16,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:18,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:18,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:12:18,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:12:22,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:12:23,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. 
Duration: 0.78 2023-09-28 13:12:23,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:28,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:12:28,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:12:33,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:12:33,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:12:34,216 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.53 vs. limit=22.5 2023-09-28 13:12:34,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:12:35,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=28133.333333333332, ans=0.1 2023-09-28 13:12:36,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 13:12:36,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:12:38,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 13:12:41,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=28200.0, ans=0.125 2023-09-28 13:12:44,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:12:46,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:46,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=28200.0, ans=0.004739130434782609 2023-09-28 13:12:51,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=28200.0, ans=0.1 2023-09-28 13:12:52,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:12:55,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 13:12:58,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:03,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:13:03,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:06,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. 
Duration: 0.94 2023-09-28 13:13:06,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=28266.666666666668, ans=0.125 2023-09-28 13:13:06,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=28266.666666666668, ans=0.125 2023-09-28 13:13:11,063 INFO [train.py:1039] (0/4) Epoch 1, batch 4250, loss[loss=0.347, simple_loss=0.3657, pruned_loss=0.1641, over 23899.00 frames. ], tot_loss[loss=0.3458, simple_loss=0.3728, pruned_loss=0.1594, over 4697847.50 frames. ], batch size: 195, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:13:11,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:13:15,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:13:17,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:13:20,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:25,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:13:25,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 13:13:27,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 13:13:28,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=28400.0, ans=0.125 2023-09-28 13:13:29,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:30,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.03 vs. limit=15.0 2023-09-28 13:13:33,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:37,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:38,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:40,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:13:41,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:13:41,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:43,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:43,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:13:46,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:13:48,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:49,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 13:13:50,084 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:13:51,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 13:13:51,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:52,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:53,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:53,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:13:53,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:55,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:58,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 13:14:00,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:14:04,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:05,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:07,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 13:14:07,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:14:07,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 13:14:07,738 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=28533.333333333332, ans=0.2 2023-09-28 13:14:10,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:14:12,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:14:13,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:13,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:14:16,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. 
Duration: 0.96 2023-09-28 13:14:18,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:14:19,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:14:22,108 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.58 vs. limit=15.0 2023-09-28 13:14:23,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:26,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:28,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:14:28,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:31,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:32,635 INFO [train.py:1039] (0/4) Epoch 1, batch 4300, loss[loss=0.3449, simple_loss=0.3655, pruned_loss=0.1622, over 23794.00 frames. ], tot_loss[loss=0.3432, simple_loss=0.3709, pruned_loss=0.1578, over 4709629.65 frames. ], batch size: 179, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:14:32,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:14:32,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:14:34,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 13:14:35,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=28666.666666666668, ans=0.05 2023-09-28 13:14:37,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:39,764 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.11 vs. limit=15.0 2023-09-28 13:14:42,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:43,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:14:46,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:52,610 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.815e+02 2.945e+02 3.341e+02 4.373e+02 7.931e+02, threshold=6.681e+02, percent-clipped=0.0 2023-09-28 13:14:53,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:53,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 13:14:54,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 13:14:57,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=28733.333333333332, ans=0.125 2023-09-28 13:14:59,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:14:59,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:14:59,609 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 13:15:02,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:15:04,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:07,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 13:15:08,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:15:08,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 13:15:09,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=28800.0, ans=0.2 2023-09-28 13:15:11,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:15:12,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 13:15:15,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:15:15,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:15:16,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:15:18,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:19,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:15:19,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 13:15:19,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 13:15:22,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:15:25,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:25,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:15:27,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:27,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 13:15:27,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 13:15:27,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 13:15:29,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 13:15:29,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:15:29,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 13:15:29,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 13:15:34,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:36,032 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 13:15:37,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:15:39,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:39,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:40,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-09-28 13:15:42,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:42,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:43,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=28933.333333333332, ans=0.1 2023-09-28 13:15:44,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:15:44,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:44,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:15:48,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:15:49,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=28933.333333333332, ans=0.0 2023-09-28 13:15:51,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:52,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:52,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 13:15:55,548 INFO [train.py:1039] (0/4) Epoch 1, batch 4350, loss[loss=0.3596, simple_loss=0.3755, pruned_loss=0.1719, over 23694.00 frames. ], tot_loss[loss=0.3438, simple_loss=0.3722, pruned_loss=0.1577, over 4718927.58 frames. ], batch size: 232, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:15:57,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 13:15:58,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:16:02,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:05,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:07,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:16:07,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:16:07,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=29000.0, ans=0.1 2023-09-28 13:16:12,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:16:12,499 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=29066.666666666668, ans=0.125 2023-09-28 13:16:17,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 13:16:18,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=29066.666666666668, ans=0.125 2023-09-28 13:16:19,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:16:19,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:16:24,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:16:27,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:16:28,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:16:29,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=29133.333333333332, ans=0.1 2023-09-28 13:16:29,290 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:16:32,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=29133.333333333332, ans=0.0 2023-09-28 13:16:33,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 13:16:33,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:16:35,701 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.03 vs. limit=15.0 2023-09-28 13:16:37,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:41,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:44,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 13:16:48,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:16:50,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:16:55,283 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 13:16:56,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:16:58,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:17:00,431 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 13:17:00,531 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 13:17:00,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 13:17:00,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=29266.666666666668, ans=0.125 2023-09-28 13:17:01,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:17:02,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:03,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:03,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:04,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=29266.666666666668, ans=0.2 2023-09-28 13:17:05,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 13:17:05,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:05,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:05,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 13:17:06,412 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.70 vs. limit=10.0 2023-09-28 13:17:06,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 13:17:07,114 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 13:17:07,124 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 13:17:07,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 13:17:10,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:17:10,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:17:12,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:12,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:17:15,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 13:17:18,356 INFO [train.py:1039] (0/4) Epoch 1, batch 4400, loss[loss=0.3156, simple_loss=0.3539, pruned_loss=0.1386, over 24491.00 frames. ], tot_loss[loss=0.3451, simple_loss=0.3739, pruned_loss=0.1582, over 4706250.01 frames. ], batch size: 63, lr: 4.18e-02, grad_scale: 32.0 2023-09-28 13:17:18,466 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. 
Duration: 0.768 2023-09-28 13:17:18,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:25,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:25,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:26,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:28,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 13:17:30,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 13:17:30,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 13:17:30,347 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 13:17:30,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=29333.333333333332, ans=10.0 2023-09-28 13:17:31,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:17:31,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:34,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. 
Duration: 0.87 2023-09-28 13:17:35,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=29400.0, ans=0.125 2023-09-28 13:17:36,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:37,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:38,339 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.230e+02 3.318e+02 4.079e+02 5.261e+02 1.011e+03, threshold=8.157e+02, percent-clipped=12.0 2023-09-28 13:17:38,448 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 13:17:41,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:41,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 13:17:41,681 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 13:17:44,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 13:17:44,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 13:17:45,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 13:17:45,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:17:47,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:48,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:50,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:51,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 13:17:51,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 13:17:53,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:54,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:17:54,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:56,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:56,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:56,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 13:17:58,692 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. 
Duration: 0.768 2023-09-28 13:18:03,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:09,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:18:12,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 13:18:15,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:18:18,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:20,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:18:20,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=29533.333333333332, ans=0.0 2023-09-28 13:18:21,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 13:18:21,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:18:22,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:18:22,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:18:23,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 13:18:28,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 13:18:29,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 13:18:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 13:18:31,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:18:31,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 13:18:31,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:18:35,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:18:38,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 13:18:41,498 INFO [train.py:1039] (0/4) Epoch 1, batch 4450, loss[loss=0.3625, simple_loss=0.3939, pruned_loss=0.1656, over 24668.00 frames. ], tot_loss[loss=0.347, simple_loss=0.3755, pruned_loss=0.1592, over 4701042.16 frames. ], batch size: 65, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:18:41,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:45,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:18:45,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:18:50,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=29666.666666666668, ans=0.125 2023-09-28 13:18:52,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:18:52,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:18:56,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:58,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:19:00,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:19:00,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=29733.333333333332, ans=0.125 2023-09-28 13:19:01,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:01,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=29733.333333333332, ans=0.2 2023-09-28 13:19:03,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-09-28 13:19:03,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:05,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:05,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:05,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:19:08,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:19:08,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=29733.333333333332, ans=0.1 2023-09-28 13:19:14,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:16,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:16,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=29800.0, ans=0.0 2023-09-28 13:19:17,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 13:19:19,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:19:22,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:19:24,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 13:19:25,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 13:19:25,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:19:25,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=29800.0, ans=0.125 2023-09-28 13:19:25,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=29800.0, ans=0.0 2023-09-28 13:19:26,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:28,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 13:19:31,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:19:31,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=29866.666666666668, ans=0.0 2023-09-28 13:19:37,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:19:37,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 13:19:37,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:37,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:37,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:19:37,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:38,146 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.92 vs. limit=15.0 2023-09-28 13:19:40,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:44,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:19:44,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 13:19:44,766 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.74 vs. limit=15.0 2023-09-28 13:19:46,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 13:19:49,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:50,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:52,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:53,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:19:55,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:19:59,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 13:19:59,665 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=19.97 vs. limit=22.5 2023-09-28 13:20:01,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:20:03,928 INFO [train.py:1039] (0/4) Epoch 1, batch 4500, loss[loss=0.3537, simple_loss=0.3702, pruned_loss=0.1686, over 23445.00 frames. ], tot_loss[loss=0.3454, simple_loss=0.3747, pruned_loss=0.1581, over 4712522.32 frames. ], batch size: 134, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:20:07,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:08,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. 
Duration: 0.72 2023-09-28 13:20:08,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 13:20:10,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:12,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=30000.0, ans=0.125 2023-09-28 13:20:15,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:20:17,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:17,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:20:18,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:20:18,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:19,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:23,487 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.079e+02 2.950e+02 3.479e+02 4.293e+02 6.506e+02, threshold=6.959e+02, percent-clipped=0.0 2023-09-28 13:20:33,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 13:20:33,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:20:37,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:20:39,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:20:39,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:20:45,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:20:49,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:20:53,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:20:57,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:20:57,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 13:20:57,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:20:59,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:02,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 13:21:02,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:21:05,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:21:05,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 13:21:05,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:21:05,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:10,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:21:10,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:21:10,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=30266.666666666668, ans=0.004289855072463768 2023-09-28 13:21:14,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:15,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:21:15,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:21:17,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-09-28 13:21:19,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=30266.666666666668, ans=0.0 2023-09-28 13:21:20,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 13:21:20,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 13:21:22,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 13:21:26,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 13:21:27,404 INFO [train.py:1039] (0/4) Epoch 1, batch 4550, loss[loss=0.3385, simple_loss=0.3869, pruned_loss=0.1451, over 24320.00 frames. ], tot_loss[loss=0.3452, simple_loss=0.3738, pruned_loss=0.1583, over 4706893.32 frames. ], batch size: 77, lr: 4.16e-02, grad_scale: 32.0 2023-09-28 13:21:27,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:32,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:32,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:37,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:40,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 13:21:42,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:44,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:21:44,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:21:44,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:46,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:48,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:50,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=30400.0, ans=0.004260869565217392 2023-09-28 13:21:51,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:21:53,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 13:21:55,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 13:21:55,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:21:56,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-09-28 13:22:00,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=30466.666666666668, ans=0.125 2023-09-28 13:22:01,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 13:22:02,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:07,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 13:22:09,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:22:09,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=30466.666666666668, ans=0.05 2023-09-28 13:22:13,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:22:16,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 13:22:19,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:22:20,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:20,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:21,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=30533.333333333332, ans=0.125 2023-09-28 13:22:22,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:24,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 13:22:24,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 13:22:24,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:22:26,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 13:22:26,972 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=30533.333333333332, ans=0.0 2023-09-28 13:22:28,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 13:22:28,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:28,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:22:29,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:31,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:31,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:22:31,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:22:32,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 13:22:35,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:35,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:22:35,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 13:22:35,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:22:37,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 13:22:40,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:22:40,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:22:42,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:22:44,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:44,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:22:45,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:22:49,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:22:51,245 INFO [train.py:1039] (0/4) Epoch 1, batch 4600, loss[loss=0.3659, simple_loss=0.3752, pruned_loss=0.1783, over 23438.00 frames. ], tot_loss[loss=0.3426, simple_loss=0.3721, pruned_loss=0.1566, over 4716367.96 frames. ], batch size: 285, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:22:52,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:54,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:54,725 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:22:56,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:22:58,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 13:22:58,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:22:59,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 13:23:02,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:23:05,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:23:06,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:07,698 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.84 vs. limit=22.5 2023-09-28 13:23:08,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:11,501 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.016e+02 3.238e+02 3.793e+02 5.269e+02 1.285e+03, threshold=7.587e+02, percent-clipped=5.0 2023-09-28 13:23:15,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 13:23:17,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:20,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:23:23,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=30800.0, ans=0.125 2023-09-28 13:23:25,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:23:25,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:31,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 13:23:31,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:23:32,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:23:37,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:39,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:23:40,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:23:44,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 13:23:44,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 13:23:46,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=30866.666666666668, ans=0.125 2023-09-28 13:23:49,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:49,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:23:51,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:51,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 13:23:52,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:53,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 13:23:53,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:55,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:56,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:57,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:57,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:23:57,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 13:23:59,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 13:24:00,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 13:24:00,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:02,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:02,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:02,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=30933.333333333332, ans=0.2 2023-09-28 13:24:04,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:24:04,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=30933.333333333332, ans=0.125 2023-09-28 13:24:08,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=30933.333333333332, ans=0.125 2023-09-28 13:24:14,391 INFO [train.py:1039] (0/4) Epoch 1, batch 4650, loss[loss=0.3373, simple_loss=0.361, pruned_loss=0.1568, over 23684.00 frames. ], tot_loss[loss=0.3408, simple_loss=0.3708, pruned_loss=0.1553, over 4726901.71 frames. 
], batch size: 232, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:24:14,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:24:18,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:18,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:19,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:24:19,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:19,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:21,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:24,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 13:24:27,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:24:27,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=31000.0, ans=0.125 2023-09-28 13:24:28,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. 
Duration: 0.88 2023-09-28 13:24:28,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:29,395 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.64 vs. limit=15.0 2023-09-28 13:24:30,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 13:24:31,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:24:32,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 13:24:32,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 13:24:32,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:33,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:24:38,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:24:38,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:39,024 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 13:24:44,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 13:24:45,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 13:24:48,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:48,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:24:50,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 13:24:51,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:24:55,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:24:55,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=31133.333333333332, ans=0.95 2023-09-28 13:24:58,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:03,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:06,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:07,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:09,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 13:25:11,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 13:25:12,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 13:25:14,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 13:25:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 13:25:15,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:23,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:25:23,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:23,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 13:25:23,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:24,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:24,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:25:26,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 13:25:26,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=31266.666666666668, ans=0.004072463768115942 2023-09-28 13:25:28,527 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.82 vs. limit=15.0 2023-09-28 13:25:30,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:25:30,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:30,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:33,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:34,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:25:34,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:25:34,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 13:25:35,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:25:36,395 INFO [train.py:1039] (0/4) Epoch 1, batch 4700, loss[loss=0.3204, simple_loss=0.3681, pruned_loss=0.1364, over 24464.00 frames. ], tot_loss[loss=0.3407, simple_loss=0.3711, pruned_loss=0.1551, over 4727166.84 frames. 
], batch size: 63, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:25:36,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 13:25:41,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=31333.333333333332, ans=0.0 2023-09-28 13:25:45,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:45,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:47,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:25:48,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:50,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:25:51,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=31333.333333333332, ans=0.125 2023-09-28 13:25:55,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 13:25:56,847 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.950e+02 3.070e+02 3.636e+02 4.699e+02 2.301e+03, threshold=7.272e+02, percent-clipped=9.0 2023-09-28 13:25:56,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-09-28 13:26:00,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:02,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:26:02,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:26:02,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=31400.0, ans=0.1 2023-09-28 13:26:05,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:12,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:26:14,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:26:17,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:26:23,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 13:26:24,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:26:26,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:30,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. 
Duration: 0.84 2023-09-28 13:26:31,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:26:36,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:26:36,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 13:26:39,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:40,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=31533.333333333332, ans=0.05 2023-09-28 13:26:41,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:41,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:26:41,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 13:26:43,259 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 13:26:43,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:46,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:26:46,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 13:26:49,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:49,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=31600.0, ans=0.2 2023-09-28 13:26:52,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 13:26:55,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:26:56,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:26:59,828 INFO [train.py:1039] (0/4) Epoch 1, batch 4750, loss[loss=0.3049, simple_loss=0.3483, pruned_loss=0.1307, over 24671.00 frames. ], tot_loss[loss=0.3409, simple_loss=0.3714, pruned_loss=0.1552, over 4731068.61 frames. ], batch size: 65, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:27:03,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:03,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:27:04,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-09-28 13:27:04,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:08,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 13:27:10,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:27:10,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:27:11,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:16,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 13:27:19,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:27:22,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 13:27:23,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:25,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=31733.333333333332, ans=0.2 2023-09-28 13:27:26,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:26,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 13:27:26,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:27,099 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 13:27:27,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 13:27:33,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 13:27:36,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:39,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:27:40,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:27:40,783 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 13:27:40,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:27:44,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:27:46,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:27:49,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-09-28 13:27:49,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 13:27:49,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:49,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:27:50,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:52,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:27:52,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 13:27:53,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 13:27:56,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:27:57,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.99 vs. limit=22.5 2023-09-28 13:28:00,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:28:00,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 13:28:00,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:28:00,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=31866.666666666668, ans=0.125 2023-09-28 13:28:02,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:04,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:28:04,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:04,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=31933.333333333332, ans=0.125 2023-09-28 13:28:05,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:28:10,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:10,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 13:28:11,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 13:28:11,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 13:28:15,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:28:15,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 13:28:16,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 13:28:21,636 INFO [train.py:1039] (0/4) Epoch 1, batch 4800, loss[loss=0.3849, simple_loss=0.3993, pruned_loss=0.1853, over 22758.00 frames. ], tot_loss[loss=0.3437, simple_loss=0.3737, pruned_loss=0.1568, over 4726729.06 frames. ], batch size: 322, lr: 4.13e-02, grad_scale: 32.0 2023-09-28 13:28:23,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:23,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:26,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=32000.0, ans=0.0 2023-09-28 13:28:29,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:28:30,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:30,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:31,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 13:28:32,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:32,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 13:28:36,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:28:38,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=32066.666666666668, ans=0.2 2023-09-28 13:28:41,500 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.966e+02 2.727e+02 3.187e+02 3.824e+02 7.207e+02, threshold=6.374e+02, percent-clipped=0.0 2023-09-28 13:28:41,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:28:43,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:43,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:28:44,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:44,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:28:44,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:46,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:49,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 13:28:54,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:54,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:55,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:28:57,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:28:59,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:00,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 13:29:02,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 13:29:02,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:02,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:29:03,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:29:03,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:03,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 13:29:06,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:29:06,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:10,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:16,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:19,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:23,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 13:29:23,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:23,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:23,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:29:25,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:29:26,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=32266.666666666668, ans=0.125 2023-09-28 13:29:27,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:29,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:29:29,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:30,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:29:30,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:29:32,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:29:36,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:36,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:36,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:37,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 13:29:40,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. 
Duration: 0.69 2023-09-28 13:29:40,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:29:40,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:43,144 INFO [train.py:1039] (0/4) Epoch 1, batch 4850, loss[loss=0.3247, simple_loss=0.355, pruned_loss=0.1472, over 18975.00 frames. ], tot_loss[loss=0.3413, simple_loss=0.3723, pruned_loss=0.1551, over 4727927.85 frames. ], batch size: 41, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:29:43,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:53,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 13:29:55,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:30:01,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:30:02,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:30:06,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:30:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:30:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:30:07,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 13:30:07,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=32400.0, ans=0.1 2023-09-28 13:30:12,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:30:15,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:30:15,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:30:15,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:30:15,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 13:30:18,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:18,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:30:25,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 13:30:26,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 13:30:27,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:30:36,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:30:36,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 13:30:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:30:37,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:30:39,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:30:39,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 13:30:39,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:43,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. 
Duration: 0.82 2023-09-28 13:30:43,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:44,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:30:45,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 13:30:54,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:54,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=32600.0, ans=0.0037826086956521737 2023-09-28 13:31:00,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:31:00,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:05,048 INFO [train.py:1039] (0/4) Epoch 1, batch 4900, loss[loss=0.3018, simple_loss=0.3517, pruned_loss=0.126, over 24489.00 frames. ], tot_loss[loss=0.3395, simple_loss=0.3705, pruned_loss=0.1543, over 4722814.99 frames. ], batch size: 63, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:31:05,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 13:31:05,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:31:12,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:31:12,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:12,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:31:17,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 13:31:22,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 13:31:23,885 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.271e+02 3.049e+02 3.510e+02 4.577e+02 9.864e+02, threshold=7.020e+02, percent-clipped=4.0 2023-09-28 13:31:25,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 13:31:27,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 13:31:27,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:29,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:29,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:31:29,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:29,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 13:31:31,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 13:31:36,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 13:31:37,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:31:39,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:31:39,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:41,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:31:42,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:44,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:31:44,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 13:31:47,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:31:48,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:31:48,836 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=32800.0, ans=0.1 2023-09-28 13:31:50,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 13:31:50,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 13:31:53,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 13:31:55,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:31:55,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:31:57,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:31:57,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:57,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:31:57,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:31:58,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 13:32:02,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:32:03,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:32:06,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:32:09,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 13:32:09,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:32:10,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 13:32:10,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 13:32:18,319 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=15.0 2023-09-28 13:32:18,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:20,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:32:21,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 13:32:23,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:23,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 13:32:23,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:23,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=32933.333333333336, ans=0.1 2023-09-28 13:32:27,204 INFO [train.py:1039] (0/4) Epoch 1, batch 4950, loss[loss=0.3189, simple_loss=0.3704, pruned_loss=0.1337, over 24409.00 frames. ], tot_loss[loss=0.3376, simple_loss=0.369, pruned_loss=0.1532, over 4735768.55 frames. ], batch size: 77, lr: 4.11e-02, grad_scale: 32.0 2023-09-28 13:32:28,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:28,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:32:28,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:28,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:32:30,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:32:33,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:35,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:37,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-28 13:32:38,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 13:32:38,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:32:38,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=33000.0, ans=0.2 2023-09-28 13:32:40,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 13:32:40,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:40,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:32:40,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:32:40,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:32:41,257 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=33000.0, ans=0.125 2023-09-28 13:32:42,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:43,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:32:45,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 13:32:46,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:47,287 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:32:49,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:49,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:53,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:32:58,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:00,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:33:02,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:02,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:03,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:33:05,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 13:33:05,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. 
Duration: 0.85 2023-09-28 13:33:05,823 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:33:09,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:10,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:33:12,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:33:12,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:33:12,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:33:13,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=33133.333333333336, ans=0.0 2023-09-28 13:33:14,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:33:16,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:18,512 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.79 vs. limit=22.5 2023-09-28 13:33:19,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:33:20,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 13:33:22,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:22,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=33200.0, ans=0.1 2023-09-28 13:33:23,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:23,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 13:33:25,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:33:25,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:33:29,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=33200.0, ans=0.1 2023-09-28 13:33:30,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:33:32,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:33:32,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:33:32,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:34,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 13:33:34,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:33:36,165 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.99 vs. limit=10.0 2023-09-28 13:33:37,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:33:37,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:33:38,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:38,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 13:33:42,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:33:49,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 13:33:50,629 INFO [train.py:1039] (0/4) Epoch 1, batch 5000, loss[loss=0.3449, simple_loss=0.3664, pruned_loss=0.1617, over 23813.00 frames. ], tot_loss[loss=0.3367, simple_loss=0.3687, pruned_loss=0.1524, over 4741391.55 frames. ], batch size: 212, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:33:50,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 13:33:54,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=33333.333333333336, ans=0.003623188405797101 2023-09-28 13:33:57,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:57,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:33:58,089 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.85 vs. limit=6.0 2023-09-28 13:33:59,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 13:34:02,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 13:34:04,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:04,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 13:34:05,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:34:05,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:34:07,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 13:34:07,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:34:07,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:08,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=33400.0, ans=0.125 2023-09-28 13:34:09,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 13:34:09,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:09,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:10,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 13:34:12,241 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.427e+02 3.057e+02 3.609e+02 4.472e+02 7.216e+02, threshold=7.218e+02, percent-clipped=2.0 2023-09-28 13:34:12,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 13:34:12,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:34:12,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 13:34:12,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:34:13,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 13:34:15,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:34:15,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 13:34:15,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 13:34:17,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 13:34:17,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:18,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:19,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 13:34:19,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:34:20,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:22,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:24,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:34:25,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. 
Duration: 0.79 2023-09-28 13:34:26,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:34:27,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:34:30,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=33466.666666666664, ans=0.125 2023-09-28 13:34:32,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 13:34:34,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=33466.666666666664, ans=0.125 2023-09-28 13:34:36,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:36,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=33466.666666666664, ans=0.1 2023-09-28 13:34:37,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:37,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:34:42,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 13:34:42,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:43,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:34:43,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:34:45,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 13:34:45,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:49,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:50,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:55,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 13:35:00,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:11,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:35:12,053 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.03 vs. limit=22.5 2023-09-28 13:35:12,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:12,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 13:35:12,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:12,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:35:12,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:35:13,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=33666.666666666664, ans=0.2 2023-09-28 13:35:14,085 INFO [train.py:1039] (0/4) Epoch 1, batch 5050, loss[loss=0.3616, simple_loss=0.38, pruned_loss=0.1716, over 23644.00 frames. ], tot_loss[loss=0.337, simple_loss=0.369, pruned_loss=0.1525, over 4743244.06 frames. ], batch size: 256, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:35:14,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 13:35:19,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:35:22,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:26,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 13:35:26,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 13:35:27,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:27,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:35:30,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:35:32,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:35:32,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:35:32,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=33733.333333333336, ans=0.125 2023-09-28 13:35:42,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 13:35:44,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:35:44,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=33733.333333333336, ans=0.125 2023-09-28 13:35:46,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:35:46,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. 
Duration: 0.68 2023-09-28 13:35:47,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:35:48,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:49,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:49,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:35:49,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 13:35:51,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 13:35:52,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:54,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:35:57,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:57,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 13:36:00,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:04,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. 
Duration: 0.99 2023-09-28 13:36:05,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:36:05,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:36:07,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:07,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:36:09,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:12,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:36:12,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:12,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:36:14,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:36:14,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 13:36:15,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:36:18,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 13:36:23,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:23,085 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 13:36:23,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:36:24,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:24,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:25,584 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.11 vs. limit=22.5 2023-09-28 13:36:26,119 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 13:36:29,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:30,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 13:36:30,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:30,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=33933.333333333336, ans=0.125 2023-09-28 13:36:34,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:36:34,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:34,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 13:36:36,982 INFO [train.py:1039] (0/4) Epoch 1, batch 5100, loss[loss=0.3213, simple_loss=0.3687, pruned_loss=0.1369, over 24283.00 frames. ], tot_loss[loss=0.3369, simple_loss=0.3695, pruned_loss=0.1521, over 4739049.23 frames. ], batch size: 61, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:36:37,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 13:36:37,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:37,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:36:39,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:36:42,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 13:36:46,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:47,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 13:36:49,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. 
Duration: 0.73 2023-09-28 13:36:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:49,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:52,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:54,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 13:36:54,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 13:36:55,090 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.96 vs. limit=15.0 2023-09-28 13:36:59,955 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 3.081e+02 3.841e+02 4.720e+02 7.459e+02, threshold=7.682e+02, percent-clipped=1.0 2023-09-28 13:37:00,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:37:01,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:37:06,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:37:09,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 13:37:09,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:37:13,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:37:13,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:37:17,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 13:37:20,079 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 13:37:21,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:21,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 13:37:21,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 13:37:26,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:32,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:37:35,243 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.58 vs. 
limit=15.0 2023-09-28 13:37:36,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 13:37:38,242 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 13:37:38,266 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 13:37:41,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 13:37:41,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:42,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 13:37:46,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 13:37:49,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:37:51,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:37:53,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 13:37:55,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=34266.666666666664, ans=0.125 2023-09-28 13:37:56,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. 
Duration: 0.57275 2023-09-28 13:37:56,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 13:38:01,385 INFO [train.py:1039] (0/4) Epoch 1, batch 5150, loss[loss=0.3256, simple_loss=0.3786, pruned_loss=0.1363, over 24299.00 frames. ], tot_loss[loss=0.338, simple_loss=0.3704, pruned_loss=0.1528, over 4743186.39 frames. ], batch size: 74, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:38:03,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:38:03,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:38:03,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:38:04,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:38:04,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:38:06,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:38:06,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 13:38:06,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 13:38:07,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. 
Duration: 0.74 2023-09-28 13:38:07,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:38:07,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 13:38:11,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:11,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:38:13,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:14,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:19,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:38:19,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 13:38:19,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:19,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:38:20,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=34400.0, ans=0.0 2023-09-28 13:38:23,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 13:38:23,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:23,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:25,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:38:25,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:38:25,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 13:38:26,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:38:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:38:29,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:38:32,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 13:38:33,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 13:38:34,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=34466.666666666664, ans=0.125 2023-09-28 13:38:35,499 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=34466.666666666664, ans=0.125 2023-09-28 13:38:39,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:38:43,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 13:38:46,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:50,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=34533.333333333336, ans=0.1 2023-09-28 13:38:54,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:55,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:00,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:00,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:02,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-09-28 13:39:04,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=34533.333333333336, ans=0.125 2023-09-28 13:39:07,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:39:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:39:08,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:39:12,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:12,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:13,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 13:39:20,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:21,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:39:24,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:39:24,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:39:24,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. 
Duration: 0.688875 2023-09-28 13:39:25,917 INFO [train.py:1039] (0/4) Epoch 1, batch 5200, loss[loss=0.2829, simple_loss=0.3404, pruned_loss=0.1126, over 24356.00 frames. ], tot_loss[loss=0.3386, simple_loss=0.3709, pruned_loss=0.1532, over 4728276.36 frames. ], batch size: 61, lr: 4.08e-02, grad_scale: 32.0 2023-09-28 13:39:26,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:39:26,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:39:27,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:39:30,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:39:32,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:39:35,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:39,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 13:39:41,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:39:41,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=34733.333333333336, ans=0.125 2023-09-28 13:39:42,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:39:43,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=34733.333333333336, ans=0.0 2023-09-28 13:39:44,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:44,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:39:44,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:46,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 13:39:47,263 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.249e+02 2.894e+02 3.453e+02 4.172e+02 7.980e+02, threshold=6.907e+02, percent-clipped=1.0 2023-09-28 13:39:47,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:39:49,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:51,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 13:39:54,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:39:56,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 13:39:56,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 13:39:57,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 13:39:59,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 13:40:00,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:00,616 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 13:40:00,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:40:00,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=34800.0, ans=0.0 2023-09-28 13:40:02,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:02,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:40:03,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 13:40:03,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:07,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:40:12,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 13:40:12,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 13:40:12,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 13:40:18,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 13:40:18,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:40:26,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:40:26,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:28,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 13:40:28,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:28,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 13:40:29,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:30,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 13:40:32,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=34933.333333333336, ans=0.125 2023-09-28 13:40:33,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:33,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:40:37,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:39,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:39,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:47,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:47,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 13:40:47,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:49,068 INFO [train.py:1039] (0/4) Epoch 1, batch 5250, loss[loss=0.3311, simple_loss=0.3634, pruned_loss=0.1494, over 23379.00 frames. ], tot_loss[loss=0.3365, simple_loss=0.3688, pruned_loss=0.1521, over 4715789.70 frames. ], batch size: 105, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:40:49,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 13:40:49,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:49,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:40:50,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:40:53,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:56,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:56,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:40:58,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:41:03,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:41:05,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:41:10,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:41:11,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:41:13,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. 
Duration: 0.95 2023-09-28 13:41:13,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:41:13,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=35066.666666666664, ans=0.2 2023-09-28 13:41:14,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:41:51,888 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.26 vs. limit=10.0 2023-09-28 13:41:59,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=35266.666666666664, ans=0.125 2023-09-28 13:42:03,684 INFO [train.py:1039] (0/4) Epoch 1, batch 5300, loss[loss=0.2765, simple_loss=0.3286, pruned_loss=0.1121, over 24623.00 frames. ], tot_loss[loss=0.3344, simple_loss=0.3658, pruned_loss=0.1515, over 4694508.31 frames. ], batch size: 60, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:42:18,828 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-1.pt 2023-09-28 13:42:28,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:42:28,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 13:42:28,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 13:42:28,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:42:28,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:28,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:28,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:28,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:28,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:42:28,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:28,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:42:29,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:42:29,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 13:42:29,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 13:42:29,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 13:42:29,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 13:42:29,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 13:42:30,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 13:42:30,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:31,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:31,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:31,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:31,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:42:31,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:32,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:32,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:32,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:32,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:42:32,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:42:32,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:32,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:42:33,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 13:42:33,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:33,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:33,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 13:42:33,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 13:42:33,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:42:34,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:42:34,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 13:42:34,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. 
Duration: 0.71 2023-09-28 13:42:34,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:35,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:42:35,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:35,923 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 13:42:36,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 13:42:36,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:42:36,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:36,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 13:42:36,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 13:42:36,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 13:42:36,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:40,300 INFO [train.py:1039] (0/4) Epoch 2, batch 0, loss[loss=0.3359, simple_loss=0.3628, pruned_loss=0.1545, over 23774.00 frames. ], tot_loss[loss=0.3359, simple_loss=0.3628, pruned_loss=0.1545, over 23774.00 frames. 
], batch size: 232, lr: 3.99e-02, grad_scale: 32.0 2023-09-28 13:42:40,301 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 13:42:56,268 INFO [train.py:1071] (0/4) Epoch 2, validation: loss=0.367, simple_loss=0.3421, pruned_loss=0.196, over 1125622.00 frames. 2023-09-28 13:42:56,268 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 13:42:57,799 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.107e+02 3.100e+02 3.616e+02 4.753e+02 9.571e+02, threshold=7.232e+02, percent-clipped=1.0 2023-09-28 13:42:59,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 13:42:59,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:43:02,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:43:06,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:06,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:43:07,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:08,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 13:43:10,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-09-28 13:43:13,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:14,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:19,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:43:19,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:19,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=35480.0, ans=0.125 2023-09-28 13:43:22,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 13:43:24,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:30,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=35546.666666666664, ans=0.125 2023-09-28 13:43:32,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=35546.666666666664, ans=0.09899494936611666 2023-09-28 13:43:33,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 13:43:34,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:35,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 13:43:39,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=35546.666666666664, ans=0.125 2023-09-28 13:43:41,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:43:41,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:43:44,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:47,839 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=35613.333333333336, ans=0.125 2023-09-28 13:43:49,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:43:53,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:53,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=35613.333333333336, ans=0.035 2023-09-28 13:44:00,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 13:44:02,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. 
Duration: 0.86 2023-09-28 13:44:02,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:02,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:03,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:44:05,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:05,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 13:44:08,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:10,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:15,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:44:19,229 INFO [train.py:1039] (0/4) Epoch 2, batch 50, loss[loss=0.3182, simple_loss=0.3749, pruned_loss=0.1308, over 24537.00 frames. ], tot_loss[loss=0.337, simple_loss=0.3727, pruned_loss=0.1507, over 1076119.53 frames. ], batch size: 71, lr: 3.98e-02, grad_scale: 32.0 2023-09-28 13:44:19,287 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 13:44:19,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 13:44:22,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:24,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:24,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 13:44:25,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:44:25,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:44:27,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=35746.666666666664, ans=0.125 2023-09-28 13:44:30,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:32,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:35,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:37,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 13:44:37,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:43,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. 
Duration: 0.87775 2023-09-28 13:44:44,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 13:44:47,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 13:44:49,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:44:52,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:44:52,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:52,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:54,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:44:54,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:44:54,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:45:02,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:05,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:05,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 13:45:05,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 13:45:07,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:45:09,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:45:09,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 13:45:10,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:12,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 13:45:18,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:45:18,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:22,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:22,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:22,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:26,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. 
Duration: 0.71 2023-09-28 13:45:27,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 13:45:27,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:29,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:30,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:45:30,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:30,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 13:45:30,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 13:45:32,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:45:32,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:33,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:45:35,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 13:45:35,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. 
Duration: 0.74 2023-09-28 13:45:35,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:36,154 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.54 vs. limit=15.0 2023-09-28 13:45:37,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:38,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:45:38,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:45:42,084 INFO [train.py:1039] (0/4) Epoch 2, batch 100, loss[loss=0.3676, simple_loss=0.4001, pruned_loss=0.1676, over 24589.00 frames. ], tot_loss[loss=0.3383, simple_loss=0.372, pruned_loss=0.1523, over 1886192.56 frames. ], batch size: 71, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:45:43,573 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.251e+02 2.783e+02 3.462e+02 4.523e+02 1.049e+03, threshold=6.924e+02, percent-clipped=4.0 2023-09-28 13:45:43,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:45:45,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:45:45,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=36080.0, ans=0.0 2023-09-28 13:45:48,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 13:45:50,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 13:45:50,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:56,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:45:56,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:56,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:56,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:57,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:57,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=36146.666666666664, ans=0.0 2023-09-28 13:45:58,245 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.63 vs. limit=15.0 2023-09-28 13:45:59,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 13:45:59,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=36146.666666666664, ans=0.125 2023-09-28 13:46:00,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 13:46:00,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:01,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=36146.666666666664, ans=0.0030115942028985515 2023-09-28 13:46:02,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:02,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:46:05,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 13:46:07,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:08,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:09,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=36146.666666666664, ans=0.125 2023-09-28 13:46:10,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:46:10,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=36146.666666666664, ans=0.0 2023-09-28 13:46:12,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:46:15,736 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. 
Duration: 0.768 2023-09-28 13:46:15,775 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 13:46:15,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:46:15,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:46:20,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:46:22,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:23,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:31,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:32,831 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 13:46:34,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:46:39,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:46:41,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:46:42,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:46:45,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:46,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=36280.0, ans=0.125 2023-09-28 13:46:48,038 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.77 vs. limit=22.5 2023-09-28 13:46:48,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:46:49,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:46:52,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:54,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:54,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:55,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:46:55,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:57,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 13:46:57,320 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. 
Duration: 0.896 2023-09-28 13:46:57,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:58,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:47:01,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:01,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:01,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:47:01,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:47:01,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:47:01,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:03,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:05,407 INFO [train.py:1039] (0/4) Epoch 2, batch 150, loss[loss=0.2731, simple_loss=0.321, pruned_loss=0.1126, over 22468.00 frames. ], tot_loss[loss=0.3344, simple_loss=0.3696, pruned_loss=0.1497, over 2517631.09 frames. ], batch size: 49, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:47:05,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:47:05,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:47:06,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:47:08,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:14,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:47:14,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:14,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:14,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=36413.333333333336, ans=0.2 2023-09-28 13:47:17,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:47:17,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:17,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=36413.333333333336, ans=0.002953623188405796 2023-09-28 13:47:20,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 13:47:20,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:22,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=36480.0, ans=0.2 2023-09-28 13:47:25,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 13:47:27,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 13:47:27,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 13:47:30,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:47:30,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:47:31,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:47:33,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:47:33,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:33,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:33,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:47:34,956 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 13:47:37,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:40,793 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-09-28 13:47:45,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:47,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:47:47,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=36546.666666666664, ans=0.1 2023-09-28 13:47:50,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 13:47:54,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:47:54,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=36613.333333333336, ans=0.125 2023-09-28 13:47:56,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:56,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:47:59,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 13:48:01,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:48:02,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:48:04,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:06,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 13:48:08,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=36613.333333333336, ans=0.125 2023-09-28 13:48:11,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:11,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:11,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:48:11,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:48:14,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:16,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-09-28 13:48:19,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:48:21,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:48:22,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:24,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:48:24,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 13:48:25,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:48:25,041 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 13:48:30,010 INFO [train.py:1039] (0/4) Epoch 2, batch 200, loss[loss=0.2918, simple_loss=0.3298, pruned_loss=0.1269, over 24452.00 frames. ], tot_loss[loss=0.3364, simple_loss=0.3714, pruned_loss=0.1507, over 3007923.75 frames. ], batch size: 58, lr: 3.96e-02, grad_scale: 32.0 2023-09-28 13:48:30,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 13:48:31,373 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.151e+02 2.761e+02 3.224e+02 4.160e+02 8.294e+02, threshold=6.447e+02, percent-clipped=1.0 2023-09-28 13:48:31,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=36746.666666666664, ans=0.5 2023-09-28 13:48:33,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:48:34,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:48:38,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 13:48:39,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:40,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:42,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 13:48:44,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:48:45,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:45,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:48:50,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:48:52,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:52,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:00,119 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.36 vs. limit=22.5 2023-09-28 13:49:10,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:49:10,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:49:11,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:49:13,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:49:13,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:49:13,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:49:13,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:49:15,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:49:15,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:16,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:18,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 13:49:19,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:49:19,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:23,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:49:27,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:35,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:35,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:49:44,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:45,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. 
Duration: 0.59 2023-09-28 13:49:47,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:47,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:49:47,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:47,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:49:51,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 13:49:52,417 INFO [train.py:1039] (0/4) Epoch 2, batch 250, loss[loss=0.32, simple_loss=0.357, pruned_loss=0.1415, over 23446.00 frames. ], tot_loss[loss=0.3345, simple_loss=0.3692, pruned_loss=0.1499, over 3386572.43 frames. ], batch size: 93, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:49:52,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:49:52,558 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 13:49:54,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:55,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:49:57,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:49:57,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:50:00,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:50:00,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:50:02,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=37080.0, ans=0.125 2023-09-28 13:50:03,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:50:09,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:50:21,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:24,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:50:24,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:50:31,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:50:33,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:50:34,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 13:50:34,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:36,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:50:36,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:50:36,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:37,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:50:38,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=37213.333333333336, ans=0.2 2023-09-28 13:50:40,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.38 vs. limit=10.0 2023-09-28 13:50:40,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 13:50:40,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:41,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=37280.0, ans=0.002765217391304348 2023-09-28 13:50:43,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:50:43,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 13:50:43,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:50:45,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:50:46,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:50:46,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:50:46,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=37280.0, ans=0.125 2023-09-28 13:50:49,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:50:51,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:50:52,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:50:57,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:51:01,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:02,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 13:51:08,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:11,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:51:14,099 INFO [train.py:1039] (0/4) Epoch 2, batch 300, loss[loss=0.3265, simple_loss=0.3767, pruned_loss=0.1382, over 24376.00 frames. ], tot_loss[loss=0.3303, simple_loss=0.3661, pruned_loss=0.1473, over 3691982.87 frames. ], batch size: 77, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:51:14,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 13:51:15,700 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.217e+02 3.009e+02 3.543e+02 4.126e+02 1.008e+03, threshold=7.086e+02, percent-clipped=8.0 2023-09-28 13:51:15,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:51:15,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:51:18,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 13:51:18,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:51:19,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:51:19,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-09-28 13:51:19,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=37413.333333333336, ans=0.2 2023-09-28 13:51:22,542 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.69 vs. limit=15.0 2023-09-28 13:51:23,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:24,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:51:29,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:51:29,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 13:51:31,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:31,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:51:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 13:51:32,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 13:51:34,712 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=37480.0, ans=0.1 2023-09-28 13:51:39,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:51:42,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:51:42,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 13:51:45,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 13:51:47,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:49,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:51,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:51,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 13:51:51,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:51:51,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=37546.666666666664, ans=0.0 2023-09-28 13:51:55,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 13:51:57,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:51:57,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:01,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:52:01,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 13:52:04,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:52:07,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:10,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 13:52:10,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:16,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:52:17,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:52:17,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 13:52:18,401 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.33 vs. 
limit=15.0 2023-09-28 13:52:22,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:22,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:52:25,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:27,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:52:27,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 13:52:29,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:52:29,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:30,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 13:52:32,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:34,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:34,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:35,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:52:36,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:37,455 INFO [train.py:1039] (0/4) Epoch 2, batch 350, loss[loss=0.3291, simple_loss=0.3665, pruned_loss=0.1459, over 23913.00 frames. ], tot_loss[loss=0.3292, simple_loss=0.3648, pruned_loss=0.1468, over 3921700.10 frames. ], batch size: 86, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:52:40,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:40,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:52:43,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=37746.666666666664, ans=0.05 2023-09-28 13:52:44,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:50,691 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.12 vs. limit=15.0 2023-09-28 13:52:51,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:53,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=37813.333333333336, ans=0.0 2023-09-28 13:52:53,793 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.60 vs. 
limit=15.0 2023-09-28 13:52:54,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:54,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:56,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 13:52:57,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:57,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 13:52:59,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:01,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 13:53:03,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:06,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 13:53:06,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=37813.333333333336, ans=0.1 2023-09-28 13:53:08,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:53:10,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 13:53:12,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:53:12,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=37880.0, ans=0.1 2023-09-28 13:53:13,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:13,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:15,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:15,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:15,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:53:18,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:53:18,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:25,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:53:25,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:53:26,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 13:53:26,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:31,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 13:53:31,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:38,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:38,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:38,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:53:39,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 13:53:42,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:42,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 13:53:44,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 13:53:44,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:46,115 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.55 vs. 
limit=12.0 2023-09-28 13:53:48,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:48,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 13:53:49,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:51,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:53:55,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:56,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:56,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:58,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:54:01,045 INFO [train.py:1039] (0/4) Epoch 2, batch 400, loss[loss=0.331, simple_loss=0.3775, pruned_loss=0.1423, over 24025.00 frames. ], tot_loss[loss=0.327, simple_loss=0.3631, pruned_loss=0.1455, over 4111299.55 frames. ], batch size: 80, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:54:01,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 13:54:02,560 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.905e+02 3.509e+02 4.327e+02 7.986e+02, threshold=7.018e+02, percent-clipped=1.0 2023-09-28 13:54:05,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:54:07,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 13:54:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:07,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:09,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:54:09,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:12,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:14,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:17,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 13:54:18,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 13:54:18,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:54:20,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 13:54:22,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:23,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=38146.666666666664, ans=0.0025768115942028996 2023-09-28 13:54:24,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:54:24,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 13:54:26,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:54:26,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:26,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:30,355 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 13:54:30,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. 
Duration: 0.94 2023-09-28 13:54:35,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:36,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:38,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 13:54:40,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 13:54:42,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:54:45,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:54:45,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=38213.333333333336, ans=0.1 2023-09-28 13:54:51,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 13:54:53,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:54:56,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 13:54:57,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=38280.0, ans=0.0025478260869565214 2023-09-28 13:55:00,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 13:55:01,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:55:02,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 13:55:05,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:55:08,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:55:10,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:55:13,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:15,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 13:55:17,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:55:18,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 13:55:18,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:55:20,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 13:55:20,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=38346.666666666664, ans=0.125 2023-09-28 13:55:23,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 13:55:24,188 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.02 vs. limit=22.5 2023-09-28 13:55:24,924 INFO [train.py:1039] (0/4) Epoch 2, batch 450, loss[loss=0.3515, simple_loss=0.3712, pruned_loss=0.1659, over 23514.00 frames. ], tot_loss[loss=0.327, simple_loss=0.3631, pruned_loss=0.1454, over 4245173.52 frames. ], batch size: 256, lr: 3.93e-02, grad_scale: 32.0 2023-09-28 13:55:25,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:55:26,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:55:26,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:55:27,407 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.60 vs. limit=15.0 2023-09-28 13:55:30,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 13:55:30,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:55:31,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 13:55:31,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:55:31,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 13:55:31,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:55:33,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:55:35,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=38413.333333333336, ans=0.125 2023-09-28 13:55:37,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:55:47,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:47,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:55:49,578 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.50 vs. limit=6.0 2023-09-28 13:55:51,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 13:55:51,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 13:55:55,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 13:55:55,872 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=38480.0, ans=0.2 2023-09-28 13:55:58,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:58,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:03,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:04,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:08,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 13:56:08,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 13:56:09,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 13:56:10,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:12,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:12,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:56:15,240 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. 
Duration: 0.896 2023-09-28 13:56:15,254 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 13:56:15,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:56:18,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:56:20,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:56:22,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=38613.333333333336, ans=0.2 2023-09-28 13:56:23,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:56:23,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:56:25,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 13:56:25,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 13:56:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:30,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:56:30,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 13:56:32,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 13:56:35,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=38680.0, ans=0.125 2023-09-28 13:56:36,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:56:37,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 13:56:38,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 13:56:39,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:45,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:56:46,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:56:47,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=38746.666666666664, ans=0.2 2023-09-28 13:56:48,167 INFO [train.py:1039] (0/4) Epoch 2, batch 500, loss[loss=0.3302, simple_loss=0.3799, pruned_loss=0.1402, over 24643.00 frames. ], tot_loss[loss=0.3279, simple_loss=0.3645, pruned_loss=0.1456, over 4347377.64 frames. ], batch size: 73, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:56:48,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 13:56:48,353 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 13:56:50,427 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.109e+02 2.855e+02 3.493e+02 4.304e+02 8.305e+02, threshold=6.986e+02, percent-clipped=1.0 2023-09-28 13:56:53,113 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.20 vs. limit=12.0 2023-09-28 13:56:53,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:53,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:56:55,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:55,154 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 13:56:56,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 13:56:56,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:00,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:57:05,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:57:06,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 13:57:08,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:57:08,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:57:08,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=38813.333333333336, ans=0.1 2023-09-28 13:57:09,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:10,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=38813.333333333336, ans=0.125 2023-09-28 13:57:18,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:18,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 13:57:19,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:57:19,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:21,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 13:57:21,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 13:57:23,966 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=38880.0, ans=0.125 2023-09-28 13:57:24,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=38880.0, ans=0.0 2023-09-28 13:57:25,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:57:25,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:57:27,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:57:27,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:28,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 13:57:33,658 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 13:57:38,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:57:38,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:39,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 13:57:41,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:57:43,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 13:57:46,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:57:47,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:57:50,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:53,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:56,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=39013.333333333336, ans=0.1 2023-09-28 13:57:58,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:01,158 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=13.02 vs. limit=15.0 2023-09-28 13:58:01,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 13:58:01,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 13:58:01,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:03,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=39013.333333333336, ans=0.1 2023-09-28 13:58:04,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=39013.333333333336, ans=0.09899494936611666 2023-09-28 13:58:06,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 13:58:08,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:58:09,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:11,443 INFO [train.py:1039] (0/4) Epoch 2, batch 550, loss[loss=0.3053, simple_loss=0.3442, pruned_loss=0.1332, over 24299.00 frames. ], tot_loss[loss=0.3308, simple_loss=0.3661, pruned_loss=0.1478, over 4424008.15 frames. ], batch size: 56, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:58:14,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 13:58:16,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 13:58:16,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:16,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-09-28 13:58:17,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:58:17,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:18,557 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.73 vs. limit=15.0 2023-09-28 13:58:19,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:19,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:20,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:58:20,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:58:21,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=39080.0, ans=0.002373913043478261 2023-09-28 13:58:22,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:23,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 13:58:23,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 13:58:25,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=39146.666666666664, ans=0.125 2023-09-28 13:58:29,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:30,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:30,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:58:32,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:35,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 13:58:37,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 13:58:38,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:58:43,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:58:43,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:44,092 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.46 vs. limit=22.5 2023-09-28 13:58:44,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 13:58:47,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:47,885 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 13:58:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:50,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 13:58:52,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:54,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:58:54,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:58:54,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=39213.333333333336, ans=0.1 2023-09-28 13:58:55,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:55,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 13:58:59,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 13:58:59,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:58:59,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:59:01,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:59:01,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:59:04,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:59:06,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:59:11,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:59:14,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:14,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:59:15,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:59:17,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:18,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:59:20,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 13:59:22,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:59:22,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:59:29,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 13:59:32,878 INFO [train.py:1039] (0/4) Epoch 2, batch 600, loss[loss=0.3314, simple_loss=0.354, pruned_loss=0.1544, over 23739.00 frames. ], tot_loss[loss=0.3316, simple_loss=0.3665, pruned_loss=0.1483, over 4484240.93 frames. ], batch size: 232, lr: 3.91e-02, grad_scale: 32.0 2023-09-28 13:59:33,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 13:59:33,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=39413.333333333336, ans=0.125 2023-09-28 13:59:34,962 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.924e+02 3.724e+02 4.722e+02 8.175e+02, threshold=7.448e+02, percent-clipped=4.0 2023-09-28 13:59:35,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=39413.333333333336, ans=0.0023014492753623177 2023-09-28 13:59:36,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:59:36,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:59:36,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 13:59:41,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=39413.333333333336, ans=0.125 2023-09-28 13:59:44,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:59:45,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:59:47,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 13:59:49,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:59:52,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:59:54,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:55,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 13:59:55,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:00:02,822 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.44 vs. limit=15.0 2023-09-28 14:00:04,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 14:00:06,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 14:00:06,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:08,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:00:10,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=39546.666666666664, ans=0.125 2023-09-28 14:00:13,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:00:13,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:00:13,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:20,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:00:24,763 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.59 vs. limit=22.5 2023-09-28 14:00:27,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:27,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:00:27,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:00:28,071 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.29 vs. limit=15.0 2023-09-28 14:00:34,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 14:00:39,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:00:39,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:00:44,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 14:00:46,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:00:48,662 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=39680.0, ans=0.0 2023-09-28 14:00:50,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 14:00:50,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:00:50,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:00:57,133 INFO [train.py:1039] (0/4) Epoch 2, batch 650, loss[loss=0.3538, simple_loss=0.3727, pruned_loss=0.1675, over 23749.00 frames. ], tot_loss[loss=0.3295, simple_loss=0.3641, pruned_loss=0.1474, over 4526882.43 frames. 
], batch size: 164, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:00:57,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:01:00,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:01:01,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:03,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:01:04,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:08,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 14:01:08,416 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=39746.666666666664, ans=0.125 2023-09-28 14:01:09,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:01:14,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:01:14,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:17,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 14:01:22,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 14:01:25,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:25,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:30,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:01:31,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:01:32,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:33,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:34,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:01:35,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=39880.0, ans=0.5 2023-09-28 14:01:36,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:38,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:01:39,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 14:01:40,942 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 14:01:40,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:40,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:42,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:44,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:44,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:01:45,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:01:47,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 14:01:47,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:01:49,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:49,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:01:49,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 14:01:51,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:01:52,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 14:01:54,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 14:01:54,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:54,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:56,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:01:56,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:58,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:02:05,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:05,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:08,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:02:10,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:02:10,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:02:11,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:19,119 INFO [train.py:1039] (0/4) Epoch 2, batch 700, loss[loss=0.2973, simple_loss=0.3421, pruned_loss=0.1262, over 24460.00 frames. ], tot_loss[loss=0.3271, simple_loss=0.3622, pruned_loss=0.146, over 4572769.60 frames. ], batch size: 63, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:02:19,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:02:19,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:20,671 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.820e+02 3.434e+02 4.210e+02 9.710e+02, threshold=6.868e+02, percent-clipped=2.0 2023-09-28 14:02:20,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:20,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:24,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 14:02:26,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 14:02:29,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. 
Duration: 0.54 2023-09-28 14:02:29,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:32,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:02:35,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 14:02:37,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=40146.666666666664, ans=0.2 2023-09-28 14:02:38,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:42,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:02:43,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:44,458 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.16 vs. limit=15.0 2023-09-28 14:02:45,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:02:45,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=40146.666666666664, ans=0.125 2023-09-28 14:02:46,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:49,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:02:50,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=40146.666666666664, ans=0.125 2023-09-28 14:02:51,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:02:51,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:02:54,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 14:02:57,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 14:03:01,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:03:02,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:03:03,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:03:07,756 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.16 vs. limit=15.0 2023-09-28 14:03:08,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:03:08,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 14:03:14,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:03:15,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:03:15,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 14:03:21,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:03:21,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:24,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:03:25,546 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.11 vs. limit=22.5 2023-09-28 14:03:29,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:03:29,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 14:03:35,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 14:03:35,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 14:03:38,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:40,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:03:40,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:03:40,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=40346.666666666664, ans=0.125 2023-09-28 14:03:41,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:41,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 14:03:43,870 INFO [train.py:1039] (0/4) Epoch 2, batch 750, loss[loss=0.3622, simple_loss=0.3814, pruned_loss=0.1715, over 23779.00 frames. ], tot_loss[loss=0.3254, simple_loss=0.3607, pruned_loss=0.145, over 4598635.22 frames. ], batch size: 164, lr: 3.89e-02, grad_scale: 32.0 2023-09-28 14:03:45,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 14:03:47,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 14:03:47,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 14:03:48,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 14:03:48,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 14:03:48,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:03:51,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 14:03:53,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:53,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:03:54,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:03:56,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:56,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:03:56,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=40413.333333333336, ans=0.0 2023-09-28 14:03:57,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:59,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:04:00,334 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.90 vs. limit=22.5 2023-09-28 14:04:00,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 14:04:02,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:04:05,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:05,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:07,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 14:04:09,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:04:11,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:11,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=40480.0, ans=0.05 2023-09-28 14:04:14,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:04:16,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 14:04:16,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:04:19,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-09-28 14:04:19,642 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 14:04:19,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 14:04:21,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:04:21,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:04:22,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:04:23,292 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.56 vs. limit=15.0 2023-09-28 14:04:26,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=40546.666666666664, ans=0.125 2023-09-28 14:04:28,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:04:28,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:28,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:04:32,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:34,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:04:35,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 14:04:35,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:04:36,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 14:04:36,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:04:39,785 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.39 vs. limit=15.0 2023-09-28 14:04:40,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:04:42,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 14:04:44,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:48,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:04:49,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=40680.0, ans=0.0020260869565217384 2023-09-28 14:04:51,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:04:51,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:04:54,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:04:57,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 14:04:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:04:59,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:01,509 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.93 vs. limit=15.0 2023-09-28 14:05:02,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:02,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=40680.0, ans=0.0 2023-09-28 14:05:05,502 INFO [train.py:1039] (0/4) Epoch 2, batch 800, loss[loss=0.3224, simple_loss=0.352, pruned_loss=0.1464, over 23466.00 frames. ], tot_loss[loss=0.3268, simple_loss=0.3619, pruned_loss=0.1459, over 4614578.27 frames. ], batch size: 134, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:05:05,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:05,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 14:05:07,068 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.053e+02 2.783e+02 3.464e+02 4.160e+02 6.985e+02, threshold=6.929e+02, percent-clipped=3.0 2023-09-28 14:05:14,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:14,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:17,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:05:17,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:18,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:20,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:22,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:26,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:27,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:05:30,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 14:05:31,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:05:31,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:33,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:05:33,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:34,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 14:05:34,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:34,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 14:05:37,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:39,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:41,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:41,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:45,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:45,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:05:50,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=40880.0, ans=0.125 2023-09-28 14:05:52,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:05:52,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:05:52,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 14:05:53,862 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 14:05:53,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 14:05:53,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:05:53,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:57,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:57,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:06:00,094 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=40946.666666666664, ans=0.125 2023-09-28 14:06:00,594 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.84 vs. 
limit=10.0 2023-09-28 14:06:03,298 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 14:06:03,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 14:06:06,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:06:07,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:06:11,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:06:11,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=41013.333333333336, ans=0.1 2023-09-28 14:06:15,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:15,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 14:06:17,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:06:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 14:06:25,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:27,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:06:27,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 14:06:29,437 INFO [train.py:1039] (0/4) Epoch 2, batch 850, loss[loss=0.2776, simple_loss=0.3285, pruned_loss=0.1134, over 24578.00 frames. ], tot_loss[loss=0.327, simple_loss=0.3619, pruned_loss=0.1461, over 4632661.16 frames. ], batch size: 60, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:06:29,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:06:29,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:31,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 14:06:31,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:31,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:06:31,826 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.69 vs. limit=15.0 2023-09-28 14:06:33,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:37,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:06:37,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:06:38,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 14:06:40,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 14:06:40,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 14:06:41,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:41,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:06:44,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:44,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:46,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:06:48,392 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.33 vs. limit=6.0 2023-09-28 14:06:50,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:50,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:50,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. 
Duration: 0.87 2023-09-28 14:06:54,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 14:06:59,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:07:01,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 14:07:01,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=41213.333333333336, ans=0.09899494936611666 2023-09-28 14:07:03,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=41213.333333333336, ans=0.125 2023-09-28 14:07:05,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 14:07:05,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 14:07:07,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=41213.333333333336, ans=0.0 2023-09-28 14:07:08,412 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 14:07:08,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:08,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:07:08,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-28 14:07:11,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:13,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:13,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 14:07:17,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:19,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:20,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:07:20,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:07:21,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:07:22,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:07:22,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. 
Duration: 0.98 2023-09-28 14:07:23,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=41280.0, ans=0.0 2023-09-28 14:07:27,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=41280.0, ans=0.2 2023-09-28 14:07:28,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:07:28,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:28,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:07:30,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:32,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:34,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:35,894 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=41346.666666666664, ans=0.0018811594202898553 2023-09-28 14:07:37,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:07:39,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:07:39,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:07:41,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:07:49,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:07:49,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:51,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 14:07:51,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:07:51,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:53,226 INFO [train.py:1039] (0/4) Epoch 2, batch 900, loss[loss=0.3125, simple_loss=0.3675, pruned_loss=0.1288, over 24604.00 frames. ], tot_loss[loss=0.3266, simple_loss=0.3625, pruned_loss=0.1454, over 4668929.93 frames. ], batch size: 68, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:07:54,703 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.862e+02 3.366e+02 4.167e+02 7.237e+02, threshold=6.733e+02, percent-clipped=1.0 2023-09-28 14:07:54,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 14:07:58,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=41413.333333333336, ans=0.0018666666666666658 2023-09-28 14:07:59,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:08:02,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:02,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 14:08:06,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:08:07,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 14:08:09,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:08:09,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:08:09,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:10,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:08:11,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:08:24,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:24,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:24,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 14:08:28,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:31,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 14:08:33,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:08:36,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=41546.666666666664, ans=0.1 2023-09-28 14:08:39,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:08:40,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:08:42,573 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 14:08:42,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 14:08:43,084 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:08:47,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:08:47,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:08:49,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-28 14:08:58,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:58,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:08:59,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 14:08:59,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:09:04,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 14:09:06,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:09:06,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:07,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:09:07,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:09,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 14:09:11,095 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 14:09:13,751 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.43 vs. 
limit=15.0 2023-09-28 14:09:14,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:09:14,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 14:09:16,241 INFO [train.py:1039] (0/4) Epoch 2, batch 950, loss[loss=0.3321, simple_loss=0.3718, pruned_loss=0.1462, over 24457.00 frames. ], tot_loss[loss=0.3266, simple_loss=0.3624, pruned_loss=0.1454, over 4668859.61 frames. ], batch size: 69, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:09:17,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:21,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 14:09:26,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:26,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=41746.666666666664, ans=0.015 2023-09-28 14:09:30,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:30,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:31,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:09:33,428 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-09-28 14:09:36,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=41813.333333333336, ans=0.04949747468305833 2023-09-28 14:09:37,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:38,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:40,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:40,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:09:40,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 14:09:41,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:09:43,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:44,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 14:09:45,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=41813.333333333336, ans=0.0 2023-09-28 14:09:46,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:49,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:09:49,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:51,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:51,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 14:09:52,076 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=41880.0, ans=0.0017652173913043478 2023-09-28 14:09:53,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:09:55,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:56,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:10:02,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:02,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:10:06,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 14:10:09,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:10:09,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 14:10:09,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:10,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:10,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:10:13,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 14:10:16,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:10:18,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:19,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:19,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 14:10:19,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:19,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:10:21,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-09-28 14:10:21,800 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=15.0 2023-09-28 14:10:23,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=42013.333333333336, ans=0.2 2023-09-28 14:10:26,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:10:29,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:35,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:37,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 14:10:37,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 14:10:37,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=42013.333333333336, ans=0.0 2023-09-28 14:10:39,876 INFO [train.py:1039] (0/4) Epoch 2, batch 1000, loss[loss=0.3133, simple_loss=0.3736, pruned_loss=0.1265, over 24329.00 frames. ], tot_loss[loss=0.3256, simple_loss=0.3612, pruned_loss=0.145, over 4679141.16 frames. ], batch size: 74, lr: 3.86e-02, grad_scale: 16.0 2023-09-28 14:10:41,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:10:42,872 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.243e+02 2.891e+02 3.339e+02 3.802e+02 9.955e+02, threshold=6.678e+02, percent-clipped=4.0 2023-09-28 14:10:44,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 14:10:44,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:51,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:10:51,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=42080.0, ans=0.2 2023-09-28 14:10:52,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 14:10:52,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 14:10:57,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:10:57,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:59,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:03,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 14:11:06,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. 
Duration: 0.76 2023-09-28 14:11:09,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 14:11:09,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:11,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 14:11:13,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 14:11:13,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 14:11:13,809 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=19.12 vs. limit=15.0 2023-09-28 14:11:14,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:16,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:22,528 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.16 vs. limit=15.0 2023-09-28 14:11:23,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.55 vs. limit=15.0 2023-09-28 14:11:23,984 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.75 vs. 
limit=22.5 2023-09-28 14:11:24,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:24,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:11:26,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:26,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=42213.333333333336, ans=0.125 2023-09-28 14:11:27,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:27,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 14:11:27,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:29,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:11:29,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:31,421 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 14:11:34,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 14:11:35,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. 
Duration: 0.86 2023-09-28 14:11:37,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 14:11:39,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:11:46,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:46,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:11:46,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:48,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:11:49,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 14:11:51,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:11:51,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 14:11:51,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 14:11:52,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:11:52,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:11:53,660 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.93 vs. limit=15.0 2023-09-28 14:11:54,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:11:58,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:11:58,845 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=17.00 vs. limit=15.0 2023-09-28 14:11:59,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:03,236 INFO [train.py:1039] (0/4) Epoch 2, batch 1050, loss[loss=0.2945, simple_loss=0.3496, pruned_loss=0.1197, over 24460.00 frames. ], tot_loss[loss=0.3241, simple_loss=0.3601, pruned_loss=0.1441, over 4695378.16 frames. ], batch size: 66, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:12:04,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:12:06,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:12:08,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:12:09,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 14:12:11,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:13,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:12:15,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:12:16,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:12:18,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:12:18,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:12:20,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:12:20,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 14:12:20,655 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=42480.0, ans=0.125 2023-09-28 14:12:21,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:21,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. 
Duration: 0.84 2023-09-28 14:12:22,614 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.55 vs. limit=22.5 2023-09-28 14:12:24,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:12:24,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 14:12:24,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:12:25,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=42480.0, ans=0.05 2023-09-28 14:12:31,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:33,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:12:35,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:38,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 14:12:38,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 14:12:39,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 14:12:40,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=42546.666666666664, ans=0.125 2023-09-28 14:12:42,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 14:12:45,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 14:12:46,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:49,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:12:53,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:12:53,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:12:55,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:12:58,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:13:03,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 14:13:04,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 14:13:05,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-28 14:13:06,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:06,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:13:08,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 14:13:10,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=42680.0, ans=0.125 2023-09-28 14:13:12,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:13:15,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:15,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:17,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:17,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:20,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:20,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 14:13:22,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 14:13:22,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 14:13:22,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 14:13:24,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:13:26,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=42746.666666666664, ans=0.125 2023-09-28 14:13:27,016 INFO [train.py:1039] (0/4) Epoch 2, batch 1100, loss[loss=0.3261, simple_loss=0.356, pruned_loss=0.1481, over 23751.00 frames. ], tot_loss[loss=0.3223, simple_loss=0.3594, pruned_loss=0.1426, over 4710388.64 frames. ], batch size: 232, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:13:29,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:13:29,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=42746.666666666664, ans=0.125 2023-09-28 14:13:30,548 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.833e+02 3.199e+02 3.709e+02 7.263e+02, threshold=6.397e+02, percent-clipped=1.0 2023-09-28 14:13:33,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:13:39,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:13:41,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 14:13:41,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:41,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 14:13:42,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:13:45,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=42813.333333333336, ans=0.05 2023-09-28 14:13:47,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:13:48,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=42813.333333333336, ans=0.02 2023-09-28 14:13:50,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:13:50,427 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=42813.333333333336, ans=0.125 2023-09-28 14:13:53,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:13:53,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 14:13:55,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-28 14:13:56,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:56,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:59,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:14:00,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:14:06,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:14:08,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 14:14:09,838 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 14:14:09,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:13,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:14,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:14:14,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:14:16,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. 
Duration: 0.74 2023-09-28 14:14:17,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:14:17,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:14:17,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:14:17,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=42946.666666666664, ans=0.5 2023-09-28 14:14:19,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:19,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 14:14:23,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=42946.666666666664, ans=0.1 2023-09-28 14:14:27,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:14:27,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 14:14:28,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:14:34,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 14:14:36,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=43013.333333333336, ans=22.5 2023-09-28 14:14:37,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.21 vs. limit=22.5 2023-09-28 14:14:38,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 14:14:38,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:14:38,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:42,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:14:42,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:44,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 14:14:44,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:14:45,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:47,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-09-28 14:14:47,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:14:47,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 14:14:48,586 INFO [train.py:1039] (0/4) Epoch 2, batch 1150, loss[loss=0.3263, simple_loss=0.3584, pruned_loss=0.1471, over 23667.00 frames. ], tot_loss[loss=0.323, simple_loss=0.3596, pruned_loss=0.1432, over 4706764.24 frames. ], batch size: 149, lr: 3.84e-02, grad_scale: 16.0 2023-09-28 14:14:48,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:14:48,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:14:50,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:14:55,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:14:59,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:15:00,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:00,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:15:00,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. 
Duration: 0.95 2023-09-28 14:15:02,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:04,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 14:15:05,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:05,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:15:10,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 14:15:12,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:17,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:18,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:18,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 14:15:18,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:15:18,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:15:20,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=43213.333333333336, ans=0.125 2023-09-28 14:15:21,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 14:15:22,120 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:15:23,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:25,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:30,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=43213.333333333336, ans=0.125 2023-09-28 14:15:35,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:42,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:43,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 14:15:45,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:45,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:15:46,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=43280.0, ans=0.001460869565217392 2023-09-28 14:15:53,319 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 14:15:54,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:01,819 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 14:16:06,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:08,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:16:08,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:16:09,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:16:11,795 INFO [train.py:1039] (0/4) Epoch 2, batch 1200, loss[loss=0.3171, simple_loss=0.3664, pruned_loss=0.1339, over 24357.00 frames. ], tot_loss[loss=0.3251, simple_loss=0.361, pruned_loss=0.1447, over 4702426.10 frames. ], batch size: 77, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:16:13,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:16:15,001 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.963e+02 2.991e+02 3.527e+02 4.351e+02 6.174e+02, threshold=7.053e+02, percent-clipped=0.0 2023-09-28 14:16:18,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:16:18,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:16:19,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:19,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:21,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:16:21,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:16:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:16:26,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:26,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:28,549 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. 
Duration: 0.896 2023-09-28 14:16:28,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=43480.0, ans=0.0014173913043478252 2023-09-28 14:16:32,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 14:16:36,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:16:38,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:16:40,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:44,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:16:44,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 14:16:44,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:54,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:16:54,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:16:54,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 14:16:56,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 14:16:59,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 14:17:04,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 14:17:04,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:17:06,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:17:07,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:07,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:17:09,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:17:09,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:17:12,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:17:12,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 14:17:14,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:17:14,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 14:17:14,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:17:17,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:17,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:21,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:17:24,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:17:27,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 14:17:31,154 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 14:17:31,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=43680.0, ans=0.125 2023-09-28 14:17:34,022 INFO [train.py:1039] (0/4) Epoch 2, batch 1250, loss[loss=0.3126, simple_loss=0.3665, pruned_loss=0.1293, over 24542.00 frames. ], tot_loss[loss=0.3269, simple_loss=0.3625, pruned_loss=0.1456, over 4691105.32 frames. ], batch size: 71, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:17:34,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 14:17:34,626 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=43746.666666666664, ans=0.0 2023-09-28 14:17:35,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:37,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:17:39,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:42,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 14:17:47,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:17:47,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:48,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 14:17:49,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:17:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:17:54,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:17:55,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 14:17:56,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:17:56,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:17:58,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:18:02,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:18:02,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:18:02,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:04,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:18:05,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:09,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:10,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:18:18,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 14:18:19,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 14:18:19,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=43880.0, ans=0.0 2023-09-28 14:18:21,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:21,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 14:18:22,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=43880.0, ans=0.1 2023-09-28 14:18:23,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:18:23,328 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 14:18:23,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:23,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:24,122 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.20 vs. limit=15.0 2023-09-28 14:18:29,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 14:18:36,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 14:18:36,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 14:18:36,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 14:18:39,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:18:41,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 14:18:42,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:43,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=44013.333333333336, ans=0.2 2023-09-28 14:18:44,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:18:44,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:18:46,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 14:18:46,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:18:46,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 14:18:46,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:18:47,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:50,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 14:18:53,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:18:56,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:18:58,501 INFO [train.py:1039] (0/4) Epoch 2, batch 1300, loss[loss=0.3373, simple_loss=0.3432, pruned_loss=0.1657, over 22621.00 frames. ], tot_loss[loss=0.3275, simple_loss=0.3628, pruned_loss=0.1461, over 4693099.41 frames. ], batch size: 322, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:18:58,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:19:01,553 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.943e+02 3.508e+02 4.700e+02 1.321e+03, threshold=7.016e+02, percent-clipped=7.0 2023-09-28 14:19:01,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:19:01,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-09-28 14:19:07,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:08,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:19:08,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:10,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:19:11,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:19:13,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 14:19:16,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=44146.666666666664, ans=0.125 2023-09-28 14:19:18,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:19:18,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:19:21,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 14:19:25,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:19:30,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:19:32,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:32,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:34,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:35,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:19:35,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:19:37,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 14:19:41,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=44213.333333333336, ans=0.0 2023-09-28 14:19:42,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:19:43,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:19:44,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 14:19:45,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:19:47,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 14:19:49,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:51,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 14:19:51,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:51,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 14:19:51,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=44280.0, ans=0.125 2023-09-28 14:19:54,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:58,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:58,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:20:01,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=44280.0, ans=15.0 2023-09-28 14:20:02,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=44280.0, ans=0.125 2023-09-28 14:20:03,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 14:20:04,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-09-28 14:20:04,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 14:20:08,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=44346.666666666664, ans=0.125 2023-09-28 14:20:09,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:20:13,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 14:20:15,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:20,907 INFO [train.py:1039] (0/4) Epoch 2, batch 1350, loss[loss=0.3676, simple_loss=0.3911, pruned_loss=0.1721, over 23689.00 frames. ], tot_loss[loss=0.3265, simple_loss=0.3617, pruned_loss=0.1456, over 4692293.24 frames. ], batch size: 85, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:20:21,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 14:20:25,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:27,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:31,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:31,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 14:20:32,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:20:33,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=44413.333333333336, ans=0.125 2023-09-28 14:20:34,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:38,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:41,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 14:20:41,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=44480.0, ans=0.0 2023-09-28 14:20:43,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:20:43,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:20:45,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 14:20:48,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:20:48,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:20:48,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. 
Duration: 0.98 2023-09-28 14:20:51,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 14:20:53,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 14:20:54,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:54,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 14:21:07,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:12,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=44613.333333333336, ans=0.0 2023-09-28 14:21:16,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:16,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:16,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 14:21:22,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:24,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 14:21:24,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 14:21:24,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:21:27,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:21:30,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 14:21:31,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:21:38,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 14:21:40,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 14:21:43,241 INFO [train.py:1039] (0/4) Epoch 2, batch 1400, loss[loss=0.3006, simple_loss=0.3512, pruned_loss=0.1251, over 24449.00 frames. ], tot_loss[loss=0.3237, simple_loss=0.3594, pruned_loss=0.144, over 4701514.69 frames. ], batch size: 63, lr: 3.81e-02, grad_scale: 32.0 2023-09-28 14:21:43,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=44746.666666666664, ans=0.0 2023-09-28 14:21:45,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 14:21:46,399 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.080e+02 2.861e+02 3.179e+02 3.709e+02 7.568e+02, threshold=6.358e+02, percent-clipped=1.0 2023-09-28 14:21:46,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 14:21:51,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:21:51,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:21:58,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 14:21:58,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 14:21:59,150 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.32 vs. limit=6.0 2023-09-28 14:22:04,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=44813.333333333336, ans=0.1 2023-09-28 14:22:07,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:22:11,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:13,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:22:13,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-09-28 14:22:13,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=44813.333333333336, ans=0.1 2023-09-28 14:22:14,314 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=16.60 vs. limit=15.0 2023-09-28 14:22:18,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:22:18,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:22:29,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:29,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:34,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 14:22:36,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:22:37,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:22:37,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:22:39,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:22:39,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=44946.666666666664, ans=0.125 2023-09-28 14:22:40,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:22:40,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:22:40,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:22:43,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 14:22:43,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:22:47,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:51,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:22:59,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=45013.333333333336, ans=0.125 2023-09-28 14:23:00,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 14:23:02,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:23:02,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:23:04,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:23:05,539 INFO [train.py:1039] (0/4) Epoch 2, batch 1450, loss[loss=0.3354, simple_loss=0.3735, pruned_loss=0.1487, over 23982.00 frames. ], tot_loss[loss=0.3222, simple_loss=0.3584, pruned_loss=0.143, over 4705790.51 frames. ], batch size: 80, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:23:05,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:05,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:23:10,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:23:10,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:23:10,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:10,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:23:16,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:18,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 14:23:18,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=45080.0, ans=0.125 2023-09-28 14:23:20,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:23:20,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 14:23:21,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:23:22,803 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.55 vs. limit=10.0 2023-09-28 14:23:23,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 14:23:24,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:25,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:25,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 14:23:27,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:28,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:23:28,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-28 14:23:28,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:30,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:23:30,876 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.53 vs. limit=22.5 2023-09-28 14:23:31,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:35,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:37,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:23:37,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:23:41,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:41,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:43,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=45213.333333333336, ans=0.001040579710144927 2023-09-28 14:23:44,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 14:23:44,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:23:44,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:45,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:23:48,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 14:23:49,279 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=45213.333333333336, ans=0.125 2023-09-28 14:23:52,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:54,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=45280.0, ans=0.0 2023-09-28 14:23:55,768 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 14:23:57,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:23:59,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:24:00,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 14:24:01,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=45280.0, ans=0.1 2023-09-28 14:24:02,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 14:24:04,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:06,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 14:24:08,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 14:24:09,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:13,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:13,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:24:14,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 14:24:17,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 14:24:17,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 14:24:19,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 14:24:20,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:24:21,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=45346.666666666664, ans=0.125 2023-09-28 14:24:30,453 INFO [train.py:1039] (0/4) Epoch 2, batch 1500, loss[loss=0.3269, simple_loss=0.3569, pruned_loss=0.1485, over 23824.00 frames. ], tot_loss[loss=0.3218, simple_loss=0.3582, pruned_loss=0.1427, over 4709212.12 frames. ], batch size: 212, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:24:30,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 14:24:30,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:24:30,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:24:32,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=45413.333333333336, ans=0.1 2023-09-28 14:24:33,281 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.694e+02 3.191e+02 3.911e+02 7.189e+02, threshold=6.382e+02, percent-clipped=1.0 2023-09-28 14:24:33,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:33,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:35,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 14:24:35,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=45413.333333333336, ans=0.1 2023-09-28 14:24:35,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=45413.333333333336, ans=0.0 2023-09-28 14:24:37,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 14:24:37,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=45413.333333333336, ans=0.125 2023-09-28 14:24:38,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:24:40,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:24:40,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:40,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:41,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:24:43,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:24:43,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=45413.333333333336, ans=0.2 2023-09-28 14:24:45,458 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=2.90 vs. limit=15.0 2023-09-28 14:24:49,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:49,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 14:24:50,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:24:51,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:24:51,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:56,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 14:25:02,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 14:25:04,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:25:06,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 14:25:08,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 14:25:08,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=45546.666666666664, ans=0.125 2023-09-28 14:25:08,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=45546.666666666664, ans=0.125 2023-09-28 14:25:11,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:11,842 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.99 vs. limit=6.0 2023-09-28 14:25:12,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:25:13,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:25:15,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 14:25:16,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:25:16,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:25:16,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=45546.666666666664, ans=0.2 2023-09-28 14:25:16,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=45546.666666666664, ans=0.125 2023-09-28 14:25:16,458 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=45546.666666666664, ans=0.0 2023-09-28 14:25:17,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 14:25:17,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:22,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:25:22,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 14:25:30,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:25:31,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:25:35,121 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 14:25:35,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:35,220 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. 
Duration: 0.896 2023-09-28 14:25:38,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:25:40,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:25:41,963 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 14:25:42,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:25:45,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 14:25:46,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:50,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:51,692 INFO [train.py:1039] (0/4) Epoch 2, batch 1550, loss[loss=0.2872, simple_loss=0.329, pruned_loss=0.1227, over 24466.00 frames. ], tot_loss[loss=0.3205, simple_loss=0.3581, pruned_loss=0.1414, over 4722863.62 frames. ], batch size: 58, lr: 3.79e-02, grad_scale: 32.0 2023-09-28 14:25:51,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:51,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:25:52,076 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=45746.666666666664, ans=0.2 2023-09-28 14:25:53,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:53,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:53,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=45746.666666666664, ans=0.125 2023-09-28 14:25:54,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 14:25:56,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 14:25:56,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:25:57,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 14:25:59,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 14:26:01,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:02,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:04,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 14:26:04,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:26:06,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:06,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:09,491 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 14:26:09,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:10,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:26:11,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:26:14,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:26:14,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 14:26:16,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:18,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 14:26:18,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. 
Duration: 0.88 2023-09-28 14:26:18,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 14:26:18,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:18,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=45813.333333333336, ans=0.125 2023-09-28 14:26:19,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:21,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=45813.333333333336, ans=0.2 2023-09-28 14:26:24,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:26:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 14:26:27,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 14:26:29,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=45880.0, ans=0.1 2023-09-28 14:26:31,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=45880.0, ans=0.0 2023-09-28 14:26:35,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:38,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 14:26:38,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:26:38,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:26:39,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 14:26:45,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:26:45,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:49,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:26:51,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:26:53,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:53,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 14:26:54,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:26:56,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:26:56,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:26:58,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:26:58,530 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 14:27:00,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:06,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 14:27:07,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=46013.333333333336, ans=0.1 2023-09-28 14:27:12,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:14,230 INFO [train.py:1039] (0/4) Epoch 2, batch 1600, loss[loss=0.312, simple_loss=0.3471, pruned_loss=0.1385, over 23539.00 frames. ], tot_loss[loss=0.3226, simple_loss=0.3593, pruned_loss=0.143, over 4718047.51 frames. ], batch size: 134, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:27:15,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:27:15,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 14:27:17,230 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.915e+02 2.937e+02 3.574e+02 4.472e+02 6.126e+02, threshold=7.147e+02, percent-clipped=0.0 2023-09-28 14:27:17,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 14:27:17,750 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:27:18,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:18,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:27:18,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:27:19,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:27:22,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:24,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 14:27:26,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 14:27:27,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 14:27:29,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:27:29,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 14:27:31,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 14:27:34,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:27:38,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:27:42,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 14:27:42,872 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=46146.666666666664, ans=0.125 2023-09-28 14:27:44,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=46146.666666666664, ans=0.125 2023-09-28 14:27:45,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:27:45,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 14:27:46,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:49,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 14:27:49,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=46213.333333333336, ans=0.025 2023-09-28 14:27:55,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-09-28 14:27:55,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=46213.333333333336, ans=0.125 2023-09-28 14:28:02,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:05,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 14:28:07,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:07,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:07,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:28:10,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 14:28:14,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=46280.0, ans=0.0 2023-09-28 14:28:15,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 14:28:17,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:28:18,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:28:18,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:28:20,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:28:21,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:28:23,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:28:30,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:32,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:28:33,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 14:28:33,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:28:34,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=46346.666666666664, ans=0.125 2023-09-28 14:28:35,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 14:28:37,321 INFO [train.py:1039] (0/4) Epoch 2, batch 1650, loss[loss=0.2924, simple_loss=0.3354, pruned_loss=0.1248, over 22130.00 frames. ], tot_loss[loss=0.3226, simple_loss=0.3596, pruned_loss=0.1428, over 4721943.11 frames. 
], batch size: 48, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:28:40,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:42,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:28:43,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:28:43,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 14:28:43,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 14:28:43,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 14:28:45,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 14:28:47,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:47,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:49,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:28:49,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:28:52,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:28:54,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 14:28:54,255 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=46480.0, ans=0.1 2023-09-28 14:28:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:57,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:57,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:28:57,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:28:57,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 14:28:57,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 14:29:04,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:29:08,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:29:16,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 14:29:17,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:29:18,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=46546.666666666664, ans=0.125 2023-09-28 14:29:21,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 14:29:24,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:26,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:29:26,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:29:27,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:29,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:29:29,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:29:32,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 14:29:34,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:34,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:36,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:29:38,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:38,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 14:29:38,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=46613.333333333336, ans=0.07 2023-09-28 14:29:41,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:41,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 14:29:43,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 14:29:43,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 14:29:43,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:44,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 14:29:44,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:46,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:46,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 14:29:51,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:54,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:29:54,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:55,200 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.99 vs. limit=15.0 2023-09-28 14:29:56,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 14:30:00,461 INFO [train.py:1039] (0/4) Epoch 2, batch 1700, loss[loss=0.3077, simple_loss=0.3607, pruned_loss=0.1274, over 24619.00 frames. ], tot_loss[loss=0.3221, simple_loss=0.3595, pruned_loss=0.1424, over 4729090.66 frames. ], batch size: 68, lr: 3.77e-02, grad_scale: 16.0 2023-09-28 14:30:00,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:30:00,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 14:30:00,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 14:30:02,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:02,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:30:02,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:05,053 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.163e+02 2.907e+02 3.272e+02 3.830e+02 6.451e+02, threshold=6.545e+02, percent-clipped=0.0 2023-09-28 14:30:05,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:30:05,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:30:05,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 14:30:10,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:30:14,698 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.18 vs. 
limit=15.0 2023-09-28 14:30:17,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=46813.333333333336, ans=0.125 2023-09-28 14:30:18,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:18,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=46813.333333333336, ans=0.1 2023-09-28 14:30:18,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=46813.333333333336, ans=0.125 2023-09-28 14:30:21,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:30:28,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:30:28,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:28,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:30,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:30:31,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 14:30:35,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 14:30:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:36,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:30:37,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=46880.0, ans=0.125 2023-09-28 14:30:38,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:30:41,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 14:30:41,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 14:30:43,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:43,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 14:30:45,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:30:52,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=46946.666666666664, ans=0.1 2023-09-28 14:30:54,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:30:54,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:30:55,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:57,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:30:57,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 14:30:57,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:31:00,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:00,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 14:31:00,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:00,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:01,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:02,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:03,280 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=24.52 vs. 
limit=22.5 2023-09-28 14:31:05,043 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=15.90 vs. limit=15.0 2023-09-28 14:31:05,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:05,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:31:05,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:07,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:31:07,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:10,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:12,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 14:31:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:15,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=47013.333333333336, ans=0.125 2023-09-28 14:31:17,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:18,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. 
Duration: 0.57 2023-09-28 14:31:24,033 INFO [train.py:1039] (0/4) Epoch 2, batch 1750, loss[loss=0.2717, simple_loss=0.3217, pruned_loss=0.1108, over 24619.00 frames. ], tot_loss[loss=0.3199, simple_loss=0.3562, pruned_loss=0.1417, over 4704397.51 frames. ], batch size: 60, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:31:27,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:30,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:30,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:31:31,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 14:31:31,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:35,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:31:35,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:40,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 14:31:43,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:44,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. 
Duration: 0.64 2023-09-28 14:31:46,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:47,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:31:49,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:31:51,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 14:31:53,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:53,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 14:32:01,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:32:05,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:05,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:08,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:08,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:11,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:32:12,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:14,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:16,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:32:17,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 14:32:20,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 14:32:24,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:27,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:27,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:32:32,366 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=26.34 vs. limit=22.5 2023-09-28 14:32:33,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:32:33,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. 
Duration: 0.436375 2023-09-28 14:32:34,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:36,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:42,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:44,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:32:44,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:32:44,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 14:32:44,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:45,825 INFO [train.py:1039] (0/4) Epoch 2, batch 1800, loss[loss=0.3151, simple_loss=0.3414, pruned_loss=0.1444, over 22826.00 frames. ], tot_loss[loss=0.3195, simple_loss=0.3568, pruned_loss=0.1411, over 4711847.84 frames. ], batch size: 322, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:32:47,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:32:47,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:32:47,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 14:32:47,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:32:49,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:32:50,424 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.289e+02 2.760e+02 3.253e+02 3.974e+02 7.457e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 14:32:51,489 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.92 vs. limit=6.0 2023-09-28 14:32:52,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:32:52,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:53,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:32:56,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:57,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=47413.333333333336, ans=0.125 2023-09-28 14:33:00,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:33:02,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:33:05,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:09,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:09,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:11,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:33:12,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:33:12,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 14:33:16,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:19,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:22,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 14:33:25,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 14:33:25,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 14:33:25,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:33:26,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=47546.666666666664, ans=0.125 2023-09-28 14:33:27,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:27,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:33:27,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:33:35,901 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 14:33:38,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:33:39,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:41,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 14:33:41,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 14:33:43,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:33:44,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:33:46,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 14:33:49,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 14:33:52,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=47680.0, ans=0.125 2023-09-28 14:33:56,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:33:58,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 14:33:59,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:33:59,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:59,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:34:01,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 14:34:04,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:34:04,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:07,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 14:34:07,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:34:09,036 INFO [train.py:1039] (0/4) Epoch 2, batch 1850, loss[loss=0.3209, simple_loss=0.3715, pruned_loss=0.1351, over 24318.00 frames. ], tot_loss[loss=0.3207, simple_loss=0.3578, pruned_loss=0.1418, over 4709263.66 frames. ], batch size: 74, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:34:10,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:34:10,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:12,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:12,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:34:14,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=47746.666666666664, ans=0.1 2023-09-28 14:34:16,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:34:16,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:18,730 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=47746.666666666664, ans=0.125 2023-09-28 14:34:19,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 14:34:20,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=47746.666666666664, ans=10.0 2023-09-28 14:34:21,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:34:29,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:34:29,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 14:34:32,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 14:34:34,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 14:34:37,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:37,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 14:34:37,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:34:39,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=47813.333333333336, ans=0.125 2023-09-28 14:34:47,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:34:49,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. 
Duration: 0.98 2023-09-28 14:34:51,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=47880.0, ans=0.2 2023-09-28 14:34:54,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:34:54,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:34:58,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 14:34:59,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:59,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:34:59,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:35:02,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:35:05,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:07,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:35:08,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:08,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-28 14:35:08,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:10,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:11,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=47946.666666666664, ans=0.1 2023-09-28 14:35:13,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:35:16,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 14:35:18,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:22,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:35:22,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:35:22,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 14:35:22,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 14:35:24,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=48013.333333333336, ans=0.2 2023-09-28 14:35:25,333 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. 
Duration: 0.896 2023-09-28 14:35:25,471 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 14:35:29,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:35:29,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:35:29,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:29,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:29,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=48013.333333333336, ans=0.00043188405797101384 2023-09-28 14:35:31,267 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 14:35:31,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:35:31,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:32,681 INFO [train.py:1039] (0/4) Epoch 2, batch 1900, loss[loss=0.3429, simple_loss=0.366, pruned_loss=0.1599, over 23448.00 frames. ], tot_loss[loss=0.3205, simple_loss=0.3577, pruned_loss=0.1416, over 4721455.88 frames. ], batch size: 285, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:35:32,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 14:35:32,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:35:34,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:35:34,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 14:35:37,578 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.980e+02 2.945e+02 3.315e+02 3.866e+02 6.379e+02, threshold=6.630e+02, percent-clipped=0.0 2023-09-28 14:35:37,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:37,734 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 14:35:37,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:35:39,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:45,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:48,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:35:50,203 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 14:35:50,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-09-28 14:35:51,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:53,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:53,250 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 14:35:53,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 14:35:57,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=48146.666666666664, ans=0.1 2023-09-28 14:35:59,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 14:36:00,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:36:01,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=48146.666666666664, ans=0.125 2023-09-28 14:36:04,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 14:36:06,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 14:36:08,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=48213.333333333336, ans=0.125 2023-09-28 14:36:17,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. 
Duration: 0.78 2023-09-28 14:36:20,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 14:36:20,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:36:20,787 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 14:36:20,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 14:36:22,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 14:36:22,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 14:36:22,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:36:27,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 14:36:30,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:36:35,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:35,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 14:36:37,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 14:36:37,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=48346.666666666664, ans=0.0 2023-09-28 14:36:42,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 14:36:42,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:47,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:36:47,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:36:47,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:36:49,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:36:49,878 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=48346.666666666664, ans=0.125 2023-09-28 14:36:50,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:36:51,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:36:52,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:36:55,542 INFO [train.py:1039] (0/4) Epoch 2, batch 1950, loss[loss=0.383, simple_loss=0.3957, pruned_loss=0.1851, over 22729.00 frames. 
], tot_loss[loss=0.3222, simple_loss=0.3593, pruned_loss=0.1425, over 4722510.31 frames. ], batch size: 322, lr: 3.74e-02, grad_scale: 16.0 2023-09-28 14:36:55,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:36:55,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:36:55,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=48413.333333333336, ans=0.0 2023-09-28 14:36:57,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:36:57,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:58,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:58,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:37:03,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:06,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:37:07,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:07,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 14:37:09,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 14:37:09,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:37:10,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:12,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:14,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:37:14,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:16,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:17,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:21,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:21,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:37:21,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:37:21,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 14:37:21,859 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=48480.0, ans=0.1 2023-09-28 14:37:24,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:29,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:37:29,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:30,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:37:30,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 14:37:31,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:37:31,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:37:31,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:34,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:37,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:37:37,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=48546.666666666664, ans=0.125 2023-09-28 14:37:43,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:37:44,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:37:46,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:37:46,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 14:37:46,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:37:52,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:53,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:37:55,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:38:03,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:04,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:38:07,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:09,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:12,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:38:13,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:15,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 14:38:16,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:38:16,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:38:17,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 14:38:19,183 INFO [train.py:1039] (0/4) Epoch 2, batch 2000, loss[loss=0.3422, simple_loss=0.3719, pruned_loss=0.1563, over 23507.00 frames. ], tot_loss[loss=0.3224, simple_loss=0.3595, pruned_loss=0.1427, over 4723500.23 frames. ], batch size: 106, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:38:19,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:22,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. 
Duration: 0.87775 2023-09-28 14:38:23,958 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.784e+02 3.169e+02 3.809e+02 6.996e+02, threshold=6.339e+02, percent-clipped=1.0 2023-09-28 14:38:24,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:38:24,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:38:26,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:38:29,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:32,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 14:38:33,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:38:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:38:37,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 14:38:38,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:38:38,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:40,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:38:43,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 14:38:44,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:50,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 14:38:50,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:38:53,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 14:38:53,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:38:57,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:38:57,655 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.20 vs. limit=22.5 2023-09-28 14:38:58,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:38:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:38:58,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:00,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:02,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 14:39:04,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 14:39:06,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:39:06,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:10,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:12,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:39:12,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:12,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:39:13,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:15,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:39:15,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:15,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:15,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=48946.666666666664, ans=0.0 2023-09-28 14:39:17,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:19,245 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.64 vs. limit=22.5 2023-09-28 14:39:20,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:20,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 14:39:21,184 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.50 vs. limit=15.0 2023-09-28 14:39:25,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:39:27,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 14:39:31,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:39:36,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:37,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:37,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:39,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:39:40,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:39:42,369 INFO [train.py:1039] (0/4) Epoch 2, batch 2050, loss[loss=0.3061, simple_loss=0.3127, pruned_loss=0.1497, over 19564.00 frames. ], tot_loss[loss=0.3209, simple_loss=0.3582, pruned_loss=0.1418, over 4727300.66 frames. ], batch size: 388, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:39:42,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:44,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:47,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:48,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:39:52,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:55,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:39:55,838 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.89 vs. limit=6.0 2023-09-28 14:39:56,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:56,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:40:00,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 14:40:00,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:40:00,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:40:00,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:40:12,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:12,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 14:40:14,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 14:40:15,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:17,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 14:40:17,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:20,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:22,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=49213.333333333336, ans=10.0 2023-09-28 14:40:22,635 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.41 vs. limit=15.0 2023-09-28 14:40:23,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:25,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:40:25,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:25,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=49213.333333333336, ans=0.125 2023-09-28 14:40:26,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:40:29,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:40:29,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:40:31,504 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=49280.0, ans=0.00015652173913043542 2023-09-28 14:40:33,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:35,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:40:38,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:40:39,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:40:39,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=49280.0, ans=0.125 2023-09-28 14:40:45,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:40:50,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:40:50,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 14:40:56,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:40:57,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:40:57,742 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=12.0 2023-09-28 14:41:00,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:41:01,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 14:41:05,379 INFO [train.py:1039] (0/4) Epoch 2, batch 2100, loss[loss=0.3013, simple_loss=0.3454, pruned_loss=0.1286, over 24302.00 frames. ], tot_loss[loss=0.3178, simple_loss=0.3554, pruned_loss=0.1401, over 4727915.95 frames. ], batch size: 61, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:41:06,919 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 14:41:06,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:07,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:08,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:09,915 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.967e+02 2.956e+02 3.430e+02 4.185e+02 6.974e+02, threshold=6.859e+02, percent-clipped=1.0 2023-09-28 14:41:10,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:41:10,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 14:41:10,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 14:41:12,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:41:14,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=49413.333333333336, ans=0.0 2023-09-28 14:41:16,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:41:17,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:41:20,005 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-09-28 14:41:20,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:20,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:41:20,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 14:41:22,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 14:41:23,078 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.39 vs. limit=22.5 2023-09-28 14:41:23,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 14:41:23,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 14:41:25,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:26,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:41:26,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 14:41:26,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 14:41:32,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 14:41:32,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:41:36,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:39,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 14:41:39,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 14:41:39,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:39,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:41:43,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 14:41:44,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:44,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 14:41:44,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 14:41:44,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 14:41:46,001 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.38 vs. limit=15.0 2023-09-28 14:41:46,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:41:49,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:41:52,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 14:41:55,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:56,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:56,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:56,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 14:41:58,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:58,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:58,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:58,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 14:41:59,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 14:42:01,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 14:42:05,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:42:10,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 14:42:10,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 14:42:17,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:18,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:42:18,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:18,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:42:20,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:42:20,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:42:22,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:42:24,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:42:24,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:26,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-09-28 14:42:27,807 INFO [train.py:1039] (0/4) Epoch 2, batch 2150, loss[loss=0.2744, simple_loss=0.2805, pruned_loss=0.1342, over 19233.00 frames. ], tot_loss[loss=0.317, simple_loss=0.3544, pruned_loss=0.1398, over 4716247.26 frames. ], batch size: 389, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:42:27,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 14:42:27,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:30,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:42:30,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:42:31,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:42:31,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:42:37,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:42:38,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:38,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:40,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 14:42:40,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:40,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:42:45,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:45,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:42:45,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:42:50,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:50,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 14:42:54,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:42:55,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=49813.333333333336, ans=0.2 2023-09-28 14:42:57,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:42:58,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 14:42:58,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:42:58,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:00,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:43:00,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:43:00,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 14:43:01,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:43:02,752 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.03 vs. limit=22.5 2023-09-28 14:43:03,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:03,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 14:43:06,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:43:09,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:09,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:43:11,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:11,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 14:43:11,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:43:14,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:15,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:17,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:19,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:43:19,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:21,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:43:21,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 14:43:22,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 14:43:24,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:43:24,101 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 14:43:26,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:27,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:43:27,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 14:43:27,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:43:27,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 14:43:29,675 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 14:43:29,675 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. 
Duration: 0.896 2023-09-28 14:43:30,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=49946.666666666664, ans=0.125 2023-09-28 14:43:31,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 14:43:31,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:31,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=50013.333333333336, ans=0.125 2023-09-28 14:43:32,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:32,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:43:34,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:34,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:43:37,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:37,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:45,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 14:43:46,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 14:43:47,474 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.13 vs. limit=15.0 2023-09-28 14:43:47,920 INFO [train.py:1039] (0/4) Epoch 2, batch 2200, loss[loss=0.3637, simple_loss=0.3768, pruned_loss=0.1753, over 22857.00 frames. ], tot_loss[loss=0.3167, simple_loss=0.3548, pruned_loss=0.1394, over 4719252.05 frames. ], batch size: 322, lr: 3.71e-02, grad_scale: 32.0 2023-09-28 14:43:51,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:43:52,562 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.160e+02 2.946e+02 3.281e+02 3.928e+02 6.005e+02, threshold=6.562e+02, percent-clipped=0.0 2023-09-28 14:43:56,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:56,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:43:56,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:59,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:44:03,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:03,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 14:44:03,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 14:44:07,092 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=50146.666666666664, ans=0.125 2023-09-28 14:44:08,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 14:44:09,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:44:16,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 14:44:19,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:19,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:19,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:44:22,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:44:22,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 14:44:27,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:44:29,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:44:29,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 14:44:29,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=50213.333333333336, ans=0.0 2023-09-28 14:44:34,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:44:34,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:38,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:44:40,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:41,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 14:44:43,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:44,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 14:44:45,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=50280.0, ans=0.0 2023-09-28 14:44:46,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:46,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 14:44:46,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:48,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:49,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:49,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:49,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:51,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:44:51,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:44:52,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:44:56,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:44:56,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:44:56,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=50346.666666666664, ans=0.0 2023-09-28 14:44:59,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:45:01,383 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 14:45:03,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:45:04,665 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 14:45:04,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:45:06,270 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 14:45:08,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:08,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:45:09,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:11,955 INFO [train.py:1039] (0/4) Epoch 2, batch 2250, loss[loss=0.3498, simple_loss=0.3761, pruned_loss=0.1618, over 23835.00 frames. ], tot_loss[loss=0.3183, simple_loss=0.3559, pruned_loss=0.1404, over 4718330.89 frames. ], batch size: 179, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:45:12,195 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. 
Duration: 0.9045625 2023-09-28 14:45:12,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=50413.333333333336, ans=0.1 2023-09-28 14:45:13,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:45:15,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:21,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=50413.333333333336, ans=0.0 2023-09-28 14:45:22,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:45:23,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:45:26,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:27,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:29,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:32,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 14:45:32,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:45:32,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:45:35,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 14:45:35,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:45:36,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=50480.0, ans=0.125 2023-09-28 14:45:37,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:38,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:42,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=50546.666666666664, ans=10.0 2023-09-28 14:45:44,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:45:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:45:45,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:45:47,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 14:45:49,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 14:45:49,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=50546.666666666664, ans=0.2 2023-09-28 14:45:49,867 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.97 vs. limit=15.0 2023-09-28 14:45:50,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:45:52,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=50546.666666666664, ans=0.125 2023-09-28 14:45:57,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:45:58,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:46:00,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:00,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:46:01,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:46:02,083 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=1.580e-02 2023-09-28 14:46:03,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:46:05,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=50613.333333333336, ans=0.125 2023-09-28 14:46:05,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=50613.333333333336, ans=22.5 2023-09-28 14:46:10,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:46:12,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:46:18,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:46:18,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:46:20,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:46:24,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:46:29,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:46:29,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 14:46:30,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:46:30,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:46:32,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 14:46:33,551 INFO [train.py:1039] (0/4) Epoch 2, batch 2300, loss[loss=0.3077, simple_loss=0.3425, pruned_loss=0.1364, over 23510.00 frames. ], tot_loss[loss=0.3203, simple_loss=0.3573, pruned_loss=0.1417, over 4714499.72 frames. ], batch size: 134, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:46:35,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:46:35,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:38,466 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.122e+02 3.000e+02 3.557e+02 4.160e+02 8.082e+02, threshold=7.115e+02, percent-clipped=2.0 2023-09-28 14:46:40,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=50746.666666666664, ans=0.04949747468305833 2023-09-28 14:46:41,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:41,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:46:45,362 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 14:46:46,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:46:53,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:46:53,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:46:54,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:46:54,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 14:46:57,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:46:58,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=50813.333333333336, ans=0.0 2023-09-28 14:47:00,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:00,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:47:03,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:47:06,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:47:10,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:47:16,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:47:17,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:47:20,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:47:22,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:47:27,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:27,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:47:27,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:47:27,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 14:47:33,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:47:33,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:33,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:33,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:47:33,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:34,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 14:47:34,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:47:36,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 14:47:36,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:47:36,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:37,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 14:47:42,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=51013.333333333336, ans=0.0 2023-09-28 14:47:44,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:47:47,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:47:52,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:52,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 14:47:52,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:47:52,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:47:54,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:47:55,607 INFO [train.py:1039] (0/4) Epoch 2, batch 2350, loss[loss=0.2707, simple_loss=0.3218, pruned_loss=0.1098, over 24297.00 frames. ], tot_loss[loss=0.3203, simple_loss=0.3572, pruned_loss=0.1417, over 4705922.80 frames. ], batch size: 56, lr: 3.69e-02, grad_scale: 32.0 2023-09-28 14:47:55,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:47:55,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 14:48:01,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:01,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 14:48:03,469 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=51080.0, ans=0.125 2023-09-28 14:48:08,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. 
Duration: 0.98 2023-09-28 14:48:08,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=51080.0, ans=0.0 2023-09-28 14:48:11,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:48:14,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:15,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:16,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 14:48:19,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:48:24,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 14:48:27,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:29,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:48:29,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 14:48:32,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:48:32,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 14:48:32,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=51213.333333333336, ans=0.0 2023-09-28 14:48:34,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:48:36,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:36,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:48:37,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:48:42,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:48:44,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 14:48:44,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:47,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:47,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:48:49,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 14:48:50,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:48:52,895 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.16 vs. limit=15.0 2023-09-28 14:48:55,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 14:48:55,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:49:00,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 14:49:05,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 14:49:05,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:49:05,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:49:07,297 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 14:49:07,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 14:49:08,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-09-28 14:49:14,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:49:18,675 INFO [train.py:1039] (0/4) Epoch 2, batch 2400, loss[loss=0.3095, simple_loss=0.3472, pruned_loss=0.136, over 23629.00 frames. ], tot_loss[loss=0.3195, simple_loss=0.357, pruned_loss=0.141, over 4717172.57 frames. ], batch size: 149, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:49:18,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:49:23,885 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.787e+02 3.340e+02 4.281e+02 7.222e+02, threshold=6.680e+02, percent-clipped=0.0 2023-09-28 14:49:24,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:49:24,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:49:25,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 14:49:25,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 14:49:33,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:49:33,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:49:36,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. 
Duration: 0.83 2023-09-28 14:49:36,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:49:37,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:37,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 14:49:44,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:46,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 14:49:51,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:49:52,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=51546.666666666664, ans=0.1 2023-09-28 14:49:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 14:49:59,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=51546.666666666664, ans=0.125 2023-09-28 14:50:00,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:50:02,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=51546.666666666664, ans=0.1 2023-09-28 14:50:03,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:06,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:06,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 14:50:08,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:50:17,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:18,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:19,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=51613.333333333336, ans=0.125 2023-09-28 14:50:22,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:24,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:50:24,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 14:50:24,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:50:24,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:24,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:24,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:50:27,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=51680.0, ans=0.0 2023-09-28 14:50:30,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:50:31,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:50:32,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 14:50:32,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 14:50:35,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:35,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:35,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. 
Duration: 0.54 2023-09-28 14:50:35,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 14:50:37,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 14:50:37,026 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 14:50:37,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 14:50:37,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:37,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=51680.0, ans=0.0 2023-09-28 14:50:38,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:38,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:40,502 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 14:50:41,856 INFO [train.py:1039] (0/4) Epoch 2, batch 2450, loss[loss=0.3265, simple_loss=0.3479, pruned_loss=0.1526, over 23780.00 frames. ], tot_loss[loss=0.3178, simple_loss=0.3546, pruned_loss=0.1405, over 4688428.35 frames. ], batch size: 232, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:50:42,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:44,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 14:50:47,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:50:47,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:52,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:52,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:53,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 14:50:57,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=51813.333333333336, ans=0.125 2023-09-28 14:50:59,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:59,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:02,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:51:02,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:51:02,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:51:02,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. 
Duration: 0.77 2023-09-28 14:51:07,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:09,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:51:09,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:51:10,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:51:12,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:51:14,962 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=51880.0, ans=0.125 2023-09-28 14:51:17,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 14:51:17,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:51:19,442 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:51:29,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:51:30,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:30,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:32,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:51:32,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:34,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:51:35,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 14:51:36,359 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-09-28 14:51:37,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:37,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:51:42,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:51:42,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 14:51:49,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:51:49,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 14:51:49,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:51:51,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:51:51,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 14:51:51,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:51:52,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:51:53,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=52013.333333333336, ans=0.0 2023-09-28 14:51:57,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:51:59,741 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.11 vs. limit=22.5 2023-09-28 14:52:01,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:01,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 14:52:01,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=52013.333333333336, ans=0.125 2023-09-28 14:52:04,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 14:52:05,485 INFO [train.py:1039] (0/4) Epoch 2, batch 2500, loss[loss=0.3225, simple_loss=0.3604, pruned_loss=0.1423, over 23721.00 frames. ], tot_loss[loss=0.3163, simple_loss=0.3536, pruned_loss=0.1395, over 4684407.30 frames. ], batch size: 232, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:52:05,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:52:10,766 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.999e+02 2.754e+02 3.242e+02 3.766e+02 6.714e+02, threshold=6.484e+02, percent-clipped=2.0 2023-09-28 14:52:12,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:22,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:52:22,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:52:25,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:25,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-09-28 14:52:29,470 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:52:32,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:52:33,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:52:35,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:52:35,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:52:36,315 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.35 vs. limit=10.0 2023-09-28 14:52:36,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 14:52:38,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:38,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:39,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=52213.333333333336, ans=0.07 2023-09-28 14:52:40,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 14:52:40,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:52:40,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=52213.333333333336, ans=0.125 2023-09-28 14:52:41,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 14:52:41,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:52:45,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:47,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:50,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:52:50,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 14:52:50,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:52:52,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:57,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:01,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 14:53:05,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:10,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:53:12,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 14:53:12,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:12,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:14,420 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:53:16,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:53:16,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:53:17,951 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 14:53:17,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 14:53:17,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 14:53:19,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 14:53:22,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 14:53:22,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 14:53:22,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:24,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 14:53:29,230 INFO [train.py:1039] (0/4) Epoch 2, batch 2550, loss[loss=0.3563, simple_loss=0.3746, pruned_loss=0.169, over 23630.00 frames. ], tot_loss[loss=0.317, simple_loss=0.3544, pruned_loss=0.1398, over 4696260.05 frames. ], batch size: 256, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:53:29,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 14:53:29,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=52413.333333333336, ans=0.125 2023-09-28 14:53:32,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:33,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:53:35,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:53:36,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:53:37,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 14:53:37,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:53:41,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 14:53:42,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:53:46,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:49,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:49,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 14:53:49,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:53:49,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:53:49,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:51,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=52480.0, ans=0.125 2023-09-28 14:53:53,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 14:53:53,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 14:53:54,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:54,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:54,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 14:53:54,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=52480.0, ans=0.0 2023-09-28 14:54:06,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:54:11,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:11,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:11,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:54:13,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:54:21,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:54:24,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-28 14:54:24,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:54:24,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:54:25,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:54:25,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:54:31,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:31,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:37,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:54:39,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 14:54:39,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:54:39,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:40,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:54:43,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-28 14:54:44,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:52,144 INFO [train.py:1039] (0/4) Epoch 2, batch 2600, loss[loss=0.2907, simple_loss=0.343, pruned_loss=0.1192, over 24360.00 frames. ], tot_loss[loss=0.3172, simple_loss=0.3551, pruned_loss=0.1397, over 4705572.29 frames. ], batch size: 61, lr: 3.66e-02, grad_scale: 32.0 2023-09-28 14:54:52,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:54:53,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:57,271 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.901e+02 3.329e+02 4.085e+02 7.147e+02, threshold=6.657e+02, percent-clipped=2.0 2023-09-28 14:54:57,461 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 14:54:59,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 14:54:59,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:54:59,237 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 14:55:00,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 14:55:00,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 14:55:02,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 14:55:02,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 14:55:02,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=52746.666666666664, ans=0.1 2023-09-28 14:55:06,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 14:55:07,625 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 14:55:11,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:55:12,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 14:55:14,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 14:55:16,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:55:16,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 14:55:19,611 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 14:55:19,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 14:55:25,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:27,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:55:27,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:27,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 14:55:30,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:55:34,443 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:55:35,691 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 14:55:36,524 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.32 vs. limit=10.0 2023-09-28 14:55:37,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=52880.0, ans=0.1 2023-09-28 14:55:42,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:42,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:42,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 14:55:42,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:55:42,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:55:44,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 14:55:46,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=52946.666666666664, ans=0.1 2023-09-28 14:55:47,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:55:47,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:55:50,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:53,029 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 14:55:54,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:54,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:55:59,291 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=53013.333333333336, ans=0.125 2023-09-28 14:55:59,306 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=53013.333333333336, ans=0.125 2023-09-28 14:56:02,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:56:02,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 14:56:04,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 14:56:04,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:56:07,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:07,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:07,503 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=53013.333333333336, ans=0.125 2023-09-28 14:56:13,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 14:56:13,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:15,317 INFO [train.py:1039] (0/4) Epoch 2, batch 2650, loss[loss=0.3064, simple_loss=0.3576, pruned_loss=0.1276, over 24631.00 frames. ], tot_loss[loss=0.3169, simple_loss=0.3548, pruned_loss=0.1395, over 4706314.42 frames. ], batch size: 65, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:56:15,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:56:19,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 14:56:19,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 14:56:20,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:56:22,254 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 14:56:22,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:56:25,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:27,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:56:29,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:30,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:56:31,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=53146.666666666664, ans=0.0 2023-09-28 14:56:32,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 14:56:32,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:56:32,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 14:56:34,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=53146.666666666664, ans=0.5 2023-09-28 14:56:35,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 14:56:38,959 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 14:56:40,732 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=53146.666666666664, ans=0.125 2023-09-28 14:56:42,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:45,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 14:56:45,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:56:47,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 14:56:50,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:50,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:56:50,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:52,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:56:57,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 14:56:57,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 14:56:57,870 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:56:59,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:03,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 14:57:04,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:05,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:06,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:06,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:06,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:08,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:11,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 14:57:12,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:57:13,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:57:13,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:57:15,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:16,896 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-8000.pt 2023-09-28 14:57:20,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:57:20,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:21,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:23,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:57:27,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:28,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:57:28,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:57:30,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 14:57:31,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=53346.666666666664, ans=0.02 2023-09-28 14:57:35,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:36,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:37,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=53346.666666666664, ans=0.09899494936611666 2023-09-28 14:57:37,372 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.32 vs. limit=15.0 2023-09-28 14:57:38,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:38,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:40,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:40,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:42,107 INFO [train.py:1039] (0/4) Epoch 2, batch 2700, loss[loss=0.2917, simple_loss=0.3437, pruned_loss=0.1198, over 24665.00 frames. ], tot_loss[loss=0.3175, simple_loss=0.3561, pruned_loss=0.1394, over 4717254.16 frames. 
], batch size: 65, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:57:42,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:57:42,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 14:57:45,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:57:46,769 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.066e+02 2.772e+02 3.228e+02 4.080e+02 7.773e+02, threshold=6.457e+02, percent-clipped=3.0 2023-09-28 14:57:46,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 14:57:47,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=53413.333333333336, ans=0.1 2023-09-28 14:57:48,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:50,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:57:52,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 14:57:52,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:57:52,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:57:52,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 14:57:53,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:57:53,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:55,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:57:55,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:57,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=53480.0, ans=0.125 2023-09-28 14:57:59,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:58:00,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 14:58:00,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:06,226 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.13 vs. 
limit=15.0 2023-09-28 14:58:07,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=53480.0, ans=0.0 2023-09-28 14:58:09,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:58:09,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:12,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=53480.0, ans=0.125 2023-09-28 14:58:14,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:58:14,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:58:14,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:58:14,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:58:19,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:22,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:22,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:58:22,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 14:58:26,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=53546.666666666664, ans=0.0 2023-09-28 14:58:29,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:29,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:58:38,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:58:38,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:58:41,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:58:42,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:58:47,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:49,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:50,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:50,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 14:58:52,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:52,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:58:52,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=53680.0, ans=0.1 2023-09-28 14:58:56,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:58,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:58,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:59:00,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 14:59:02,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:03,497 INFO [train.py:1039] (0/4) Epoch 2, batch 2750, loss[loss=0.3261, simple_loss=0.3702, pruned_loss=0.141, over 24096.00 frames. ], tot_loss[loss=0.3165, simple_loss=0.3552, pruned_loss=0.1389, over 4729405.14 frames. ], batch size: 86, lr: 3.64e-02, grad_scale: 16.0 2023-09-28 14:59:05,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:59:05,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-09-28 14:59:07,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 14:59:07,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:08,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:08,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:11,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:12,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:59:12,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:17,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:18,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:59:18,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:59:18,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:18,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-09-28 14:59:18,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:59:18,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:25,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 14:59:27,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:59:27,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:28,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:59:29,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:59:30,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:31,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:59:31,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:33,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:38,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 14:59:38,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:59:38,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:59:40,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:42,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:59:48,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:49,451 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.05 vs. limit=6.0 2023-09-28 14:59:50,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:59:50,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:51,025 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.90 vs. limit=10.0 2023-09-28 14:59:55,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:55,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. 
Duration: 0.5 2023-09-28 14:59:57,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:59:57,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=53946.666666666664, ans=0.0 2023-09-28 15:00:02,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:00:02,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:00:02,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 15:00:06,631 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.99 vs. limit=15.0 2023-09-28 15:00:08,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:10,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 15:00:16,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:00:18,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:00:18,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 15:00:20,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 15:00:23,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:00:23,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 15:00:24,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:00:26,422 INFO [train.py:1039] (0/4) Epoch 2, batch 2800, loss[loss=0.3291, simple_loss=0.379, pruned_loss=0.1396, over 23571.00 frames. ], tot_loss[loss=0.3155, simple_loss=0.3545, pruned_loss=0.1383, over 4727907.60 frames. ], batch size: 85, lr: 3.64e-02, grad_scale: 32.0 2023-09-28 15:00:28,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:00:28,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:28,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:00:29,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 15:00:29,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:29,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:32,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:00:32,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=54080.0, ans=0.125 2023-09-28 15:00:33,358 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.006e+02 2.948e+02 3.600e+02 4.282e+02 6.554e+02, threshold=7.201e+02, percent-clipped=1.0 2023-09-28 15:00:33,507 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 15:00:33,508 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 15:00:36,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:38,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:00:38,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:00:41,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:00:43,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 15:00:47,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:00:48,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 15:00:50,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:00:50,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:00:50,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:00:50,702 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=54146.666666666664, ans=0.125 2023-09-28 15:00:51,226 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.24 vs. limit=15.0 2023-09-28 15:00:53,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:00:53,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:53,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:00:56,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:03,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:01:05,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:10,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:01:10,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:01:12,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:15,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:15,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 15:01:16,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:18,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:18,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:01:22,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:23,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:26,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:28,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:01:28,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:01:28,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:01:28,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:01:28,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:01:30,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:01:30,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 15:01:32,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:32,323 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:01:33,567 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=54346.666666666664, ans=0.125 2023-09-28 15:01:34,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:34,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:35,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. 
Duration: 0.86 2023-09-28 15:01:35,802 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=18.06 vs. limit=15.0 2023-09-28 15:01:36,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:36,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:01:38,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:01:39,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 15:01:46,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:46,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:01:46,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:01:49,652 INFO [train.py:1039] (0/4) Epoch 2, batch 2850, loss[loss=0.3264, simple_loss=0.3752, pruned_loss=0.1388, over 24655.00 frames. ], tot_loss[loss=0.3134, simple_loss=0.3528, pruned_loss=0.137, over 4732640.23 frames. ], batch size: 73, lr: 3.63e-02, grad_scale: 32.0 2023-09-28 15:01:49,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:01:56,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 15:01:56,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:56,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:56,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=54413.333333333336, ans=0.1 2023-09-28 15:01:56,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=54413.333333333336, ans=0.125 2023-09-28 15:02:01,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:01,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:02:02,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:02:02,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 15:02:03,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=54413.333333333336, ans=0.125 2023-09-28 15:02:09,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 15:02:09,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:02:09,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=54480.0, ans=0.0 2023-09-28 15:02:12,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 15:02:13,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:15,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 15:02:17,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 15:02:17,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:19,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=54480.0, ans=0.0 2023-09-28 15:02:31,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:32,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:32,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:02:34,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:02:34,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 15:02:34,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:02:36,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:02:36,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 15:02:39,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:02:39,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:02:39,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:41,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:44,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:44,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:45,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:48,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:50,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:02:52,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:53,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:56,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:03:02,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:03:04,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 15:03:04,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 15:03:06,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:03:08,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:08,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 15:03:09,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:03:09,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:10,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:03:10,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:03:10,916 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 15:03:10,970 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 15:03:10,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:12,361 INFO [train.py:1039] (0/4) Epoch 2, batch 2900, loss[loss=0.318, simple_loss=0.3714, pruned_loss=0.1323, over 24545.00 frames. ], tot_loss[loss=0.3137, simple_loss=0.353, pruned_loss=0.1372, over 4723091.64 frames. ], batch size: 71, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:03:12,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:15,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:15,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:18,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:03:19,517 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.086e+02 2.913e+02 3.691e+02 4.538e+02 7.186e+02, threshold=7.382e+02, percent-clipped=0.0 2023-09-28 15:03:19,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. 
Duration: 0.75 2023-09-28 15:03:22,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:22,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 15:03:24,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 15:03:26,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:03:26,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:03:29,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:30,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:03:31,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=54813.333333333336, ans=0.125 2023-09-28 15:03:34,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:34,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:34,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=54813.333333333336, ans=0.0 2023-09-28 15:03:39,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. 
Duration: 0.77775 2023-09-28 15:03:39,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 15:03:39,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:03:42,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:44,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 15:03:44,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 15:03:47,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:47,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 15:03:47,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:03:50,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:03:50,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:55,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:57,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:04:00,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:04:01,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:05,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 15:04:06,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 15:04:06,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:04:09,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:04:12,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 15:04:12,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=54946.666666666664, ans=0.1 2023-09-28 15:04:14,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:04:20,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:04:20,443 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=55013.333333333336, ans=0.125 2023-09-28 15:04:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 15:04:29,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:04:31,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 15:04:34,442 INFO [train.py:1039] (0/4) Epoch 2, batch 2950, loss[loss=0.3079, simple_loss=0.3649, pruned_loss=0.1254, over 24627.00 frames. ], tot_loss[loss=0.3134, simple_loss=0.3532, pruned_loss=0.1368, over 4725236.24 frames. ], batch size: 68, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:04:34,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:34,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 15:04:34,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:36,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:04:41,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:42,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 15:04:44,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:44,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:04:46,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:04:46,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:04:48,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 15:04:49,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 15:04:50,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:04:50,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:58,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:04:59,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:01,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:03,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:06,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:07,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 15:05:08,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:05:08,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=55213.333333333336, ans=0.125 2023-09-28 15:05:11,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 15:05:16,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 15:05:16,837 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 15:05:18,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:05:20,649 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 15:05:23,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 15:05:23,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:24,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 15:05:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 15:05:24,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:05:27,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 15:05:27,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:27,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:05:27,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=55280.0, ans=0.1 2023-09-28 15:05:30,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:32,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:05:33,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:33,851 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 15:05:33,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:05:34,158 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:05:35,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 15:05:35,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=55280.0, ans=0.1 2023-09-28 15:05:40,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:42,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:05:42,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 15:05:42,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:05:45,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 15:05:46,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:48,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:50,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:05:50,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:05:52,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:05:54,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:05:54,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:54,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:05:54,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:05:56,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:56,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:05:57,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:57,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 15:05:59,281 INFO [train.py:1039] (0/4) Epoch 2, batch 3000, loss[loss=0.4449, simple_loss=0.4279, pruned_loss=0.2309, over 19268.00 frames. ], tot_loss[loss=0.3144, simple_loss=0.3539, pruned_loss=0.1374, over 4723558.79 frames. 
], batch size: 389, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:05:59,282 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 15:06:14,692 INFO [train.py:1071] (0/4) Epoch 2, validation: loss=0.3279, simple_loss=0.3383, pruned_loss=0.1588, over 1125622.00 frames. 2023-09-28 15:06:14,693 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 15:06:14,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:06:17,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:06:17,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:06:20,856 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.946e+02 3.548e+02 4.220e+02 7.965e+02, threshold=7.096e+02, percent-clipped=1.0 2023-09-28 15:06:20,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 15:06:21,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 15:06:23,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:06:23,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:06:24,280 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.74 vs. limit=10.0 2023-09-28 15:06:25,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. 
Duration: 0.48 2023-09-28 15:06:25,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:33,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:06:43,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:06:44,472 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.29 vs. limit=10.0 2023-09-28 15:06:47,975 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=55546.666666666664, ans=0.0 2023-09-28 15:06:49,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=55546.666666666664, ans=0.125 2023-09-28 15:06:50,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 15:06:50,925 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=55546.666666666664, ans=0.0 2023-09-28 15:06:52,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:06:55,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:06:55,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:06:55,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:06:58,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:06:58,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 15:07:01,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 15:07:02,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:07:02,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:07:04,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:07:04,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:06,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:06,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:07:09,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 15:07:09,688 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.39 vs. limit=22.5 2023-09-28 15:07:10,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:07:10,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:07:13,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:16,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 15:07:16,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:07:16,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:17,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:07:21,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=55680.0, ans=0.0 2023-09-28 15:07:23,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:23,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:24,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. 
Duration: 0.488875 2023-09-28 15:07:24,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 15:07:24,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:07:24,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 15:07:26,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:07:28,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 15:07:31,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:07:31,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=55680.0, ans=0.125 2023-09-28 15:07:34,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:07:34,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 15:07:35,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 15:07:35,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:07:36,402 INFO [train.py:1039] (0/4) Epoch 2, batch 3050, loss[loss=0.3452, simple_loss=0.3723, pruned_loss=0.1591, over 23332.00 frames. 
], tot_loss[loss=0.3141, simple_loss=0.3542, pruned_loss=0.137, over 4737323.11 frames. ], batch size: 105, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:07:37,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:07:39,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:39,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:07:39,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:39,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:07:41,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 15:07:42,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:07:45,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:07:46,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:07:51,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:54,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. 
Duration: 0.94 2023-09-28 15:07:59,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=55813.333333333336, ans=0.125 2023-09-28 15:08:00,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 15:08:02,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 15:08:02,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:04,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:08:07,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:08,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:10,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:13,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:13,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:08:15,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:15,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:08:15,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:16,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:17,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=55880.0, ans=0.125 2023-09-28 15:08:18,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:21,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:21,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 15:08:21,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:21,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:08:21,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=55880.0, ans=0.2 2023-09-28 15:08:24,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:08:26,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:08:27,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:08:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:31,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:33,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:40,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:08:40,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:43,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:44,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:08:44,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:46,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 15:08:50,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:50,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:08:51,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 15:08:53,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:57,737 INFO [train.py:1039] (0/4) Epoch 2, batch 3100, loss[loss=0.3292, simple_loss=0.3489, pruned_loss=0.1547, over 23695.00 frames. ], tot_loss[loss=0.3134, simple_loss=0.3536, pruned_loss=0.1366, over 4736536.30 frames. ], batch size: 212, lr: 3.60e-02, grad_scale: 32.0 2023-09-28 15:08:59,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:00,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:09:03,852 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.748e+02 3.065e+02 3.838e+02 6.915e+02, threshold=6.130e+02, percent-clipped=0.0 2023-09-28 15:09:03,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:09:06,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 15:09:09,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 15:09:11,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 15:09:11,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 15:09:14,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:09:14,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:15,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=56146.666666666664, ans=0.1 2023-09-28 15:09:16,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:09:21,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:27,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 15:09:30,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:09:30,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:31,504 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.29 vs. limit=15.0 2023-09-28 15:09:32,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:09:32,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:09:33,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:09:35,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:09:35,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 15:09:35,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:09:38,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:38,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 15:09:40,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:09:42,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:09:44,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 15:09:44,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 15:09:46,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:47,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:09:49,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:09:49,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:49,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:09:52,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:09:52,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:56,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:09:56,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:09:56,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:56,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:10:02,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:10:04,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 15:10:05,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 15:10:07,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 15:10:07,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:07,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:07,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 15:10:07,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=56346.666666666664, ans=0.0 2023-09-28 15:10:16,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 15:10:19,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:19,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:20,596 INFO [train.py:1039] (0/4) Epoch 2, batch 3150, loss[loss=0.322, simple_loss=0.3485, pruned_loss=0.1477, over 23826.00 frames. ], tot_loss[loss=0.3128, simple_loss=0.3521, pruned_loss=0.1368, over 4712071.20 frames. ], batch size: 212, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:10:22,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:10:22,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 15:10:25,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 15:10:25,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:27,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:10:29,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 15:10:29,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:30,770 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 15:10:35,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 15:10:35,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:10:36,709 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 15:10:38,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:10:39,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 15:10:41,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. 
Duration: 0.71 2023-09-28 15:10:41,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 15:10:41,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:41,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:10:41,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=56480.0, ans=0.0 2023-09-28 15:10:42,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:43,575 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.52 vs. limit=22.5 2023-09-28 15:10:44,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 15:10:44,783 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=56480.0, ans=0.1 2023-09-28 15:10:45,160 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.57 vs. limit=15.0 2023-09-28 15:10:46,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:46,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:10:48,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:50,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:10:53,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 15:10:55,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:10:58,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:10:59,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:59,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 15:11:03,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 15:11:05,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:11:05,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:11:05,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:11:05,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:11:05,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:11:08,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:11:08,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:11:08,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 15:11:10,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:11:10,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:11,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:11:11,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:11:12,222 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=12.0 2023-09-28 15:11:13,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 15:11:13,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:11:13,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=56613.333333333336, ans=0.125 2023-09-28 15:11:15,090 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:11:16,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 15:11:16,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:16,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 15:11:18,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 15:11:18,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:11:20,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:22,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 15:11:22,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 15:11:24,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:24,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.16 vs. 
limit=22.5 2023-09-28 15:11:27,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:11:28,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:28,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:11:33,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:11:36,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:40,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 15:11:43,299 INFO [train.py:1039] (0/4) Epoch 2, batch 3200, loss[loss=0.316, simple_loss=0.368, pruned_loss=0.132, over 24563.00 frames. ], tot_loss[loss=0.3115, simple_loss=0.3516, pruned_loss=0.1357, over 4720832.86 frames. ], batch size: 71, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:11:45,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:11:45,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 15:11:49,713 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 2.897e+02 3.504e+02 4.245e+02 7.793e+02, threshold=7.007e+02, percent-clipped=2.0 2023-09-28 15:11:49,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:11:50,206 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=56746.666666666664, ans=0.0 2023-09-28 15:11:51,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:11:51,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 15:11:54,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:56,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=56746.666666666664, ans=0.0 2023-09-28 15:12:01,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:12:04,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:12:13,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:12:24,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 15:12:25,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:12:28,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 15:12:29,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-28 15:12:34,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:12:34,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:12:34,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=56946.666666666664, ans=0.1 2023-09-28 15:12:36,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:12:41,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 15:12:42,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:12:44,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 15:12:47,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 15:12:49,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:12:54,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:56,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:12:56,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:12:57,726 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 15:12:57,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:13:00,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:01,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 15:13:03,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 15:13:03,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 15:13:03,444 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=57013.333333333336, ans=0.125 2023-09-28 15:13:04,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 15:13:06,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:13:07,633 INFO [train.py:1039] (0/4) Epoch 2, batch 3250, loss[loss=0.316, simple_loss=0.3474, pruned_loss=0.1423, over 23592.00 frames. ], tot_loss[loss=0.3103, simple_loss=0.351, pruned_loss=0.1348, over 4729181.25 frames. ], batch size: 120, lr: 3.58e-02, grad_scale: 32.0 2023-09-28 15:13:10,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-09-28 15:13:10,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 15:13:10,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:10,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:11,627 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 15:13:14,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:13:18,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:26,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:13:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 15:13:28,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:28,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:13:28,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:29,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 15:13:29,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:13:34,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:34,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:13:34,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:36,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:13:40,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:42,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:42,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:44,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:46,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:13:46,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:46,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:13:49,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 15:13:49,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=57213.333333333336, ans=0.0 2023-09-28 15:13:50,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:50,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:13:53,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:53,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:13:59,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:14:09,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:11,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:14:11,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 15:14:11,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:14:11,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:14:11,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:15,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 15:14:16,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 15:14:16,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:14:17,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:19,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:19,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:14:20,183 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.60 vs. limit=15.0 2023-09-28 15:14:21,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:14:24,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:24,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:27,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 15:14:27,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:31,066 INFO [train.py:1039] (0/4) Epoch 2, batch 3300, loss[loss=0.3093, simple_loss=0.3664, pruned_loss=0.1261, over 24654.00 frames. ], tot_loss[loss=0.3113, simple_loss=0.352, pruned_loss=0.1353, over 4729768.47 frames. ], batch size: 73, lr: 3.58e-02, grad_scale: 16.0 2023-09-28 15:14:31,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:14:31,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 15:14:34,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:34,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 15:14:35,403 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.78 vs. limit=22.5 2023-09-28 15:14:36,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. 
Duration: 0.84 2023-09-28 15:14:37,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 15:14:37,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:38,873 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.772e+02 3.522e+02 4.271e+02 9.362e+02, threshold=7.044e+02, percent-clipped=2.0 2023-09-28 15:14:41,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:42,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:14:44,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:46,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:14:46,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:14:49,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:51,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:54,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 15:14:54,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:14:54,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:56,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:56,565 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 15:14:58,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:14:58,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:14:59,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:14:59,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:14:59,652 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 15:15:02,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:02,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:15:03,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=57546.666666666664, ans=0.125 2023-09-28 15:15:05,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:15:05,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 15:15:06,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 15:15:06,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:08,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:15:09,906 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 15:15:12,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 15:15:14,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:15:16,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 15:15:20,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:22,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:15:22,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:15:25,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:15:25,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:25,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:26,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:15:29,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:15:29,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:29,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:15:29,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=57613.333333333336, ans=0.1 2023-09-28 15:15:30,834 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 15:15:31,092 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=57613.333333333336, ans=0.0 2023-09-28 15:15:32,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 15:15:34,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:15:35,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 15:15:35,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:38,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:38,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:40,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:15:40,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:42,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:15:42,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:43,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=57680.0, ans=0.2 2023-09-28 15:15:45,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:15:48,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 15:15:48,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:49,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:15:53,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:15:53,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:53,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=57746.666666666664, ans=0.125 2023-09-28 15:15:54,721 INFO [train.py:1039] (0/4) Epoch 2, batch 3350, loss[loss=0.2779, simple_loss=0.3255, pruned_loss=0.1152, over 24640.00 frames. ], tot_loss[loss=0.3117, simple_loss=0.3526, pruned_loss=0.1354, over 4733934.82 frames. ], batch size: 60, lr: 3.57e-02, grad_scale: 16.0 2023-09-28 15:15:54,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:56,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:56,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:59,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:16:01,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:03,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:16:06,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:16:07,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:16:09,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:09,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:16:11,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 15:16:13,393 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 15:16:13,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:15,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 15:16:15,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 15:16:16,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:16:18,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:16:18,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:19,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. 
Duration: 0.81 2023-09-28 15:16:19,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:19,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:16:21,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:21,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=57813.333333333336, ans=0.1 2023-09-28 15:16:24,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:25,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:28,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:16:31,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:34,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:34,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:39,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:16:40,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:41,442 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.35 vs. limit=15.0 2023-09-28 15:16:43,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:43,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:46,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:48,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 15:16:48,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:16:48,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 15:16:48,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:16:50,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 15:16:51,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:53,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:16:53,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=57946.666666666664, ans=0.125 2023-09-28 15:17:00,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:01,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 15:17:03,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:04,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:17:06,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:17:06,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=58013.333333333336, ans=0.04949747468305833 2023-09-28 15:17:12,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:12,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 15:17:12,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:17:12,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 15:17:14,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:15,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 15:17:16,505 INFO [train.py:1039] (0/4) Epoch 2, batch 3400, loss[loss=0.3662, simple_loss=0.381, pruned_loss=0.1757, over 22722.00 frames. ], tot_loss[loss=0.3142, simple_loss=0.3545, pruned_loss=0.1369, over 4729270.55 frames. ], batch size: 322, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:17:16,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:16,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 15:17:18,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:18,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:19,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:17:21,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:17:21,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. 
Duration: 0.69 2023-09-28 15:17:24,561 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.046e+02 2.787e+02 3.091e+02 3.869e+02 5.571e+02, threshold=6.183e+02, percent-clipped=0.0 2023-09-28 15:17:26,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 15:17:26,268 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 15:17:26,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:17:28,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=58080.0, ans=0.125 2023-09-28 15:17:30,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:30,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:31,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:32,751 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.74 vs. limit=15.0 2023-09-28 15:17:33,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:17:37,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=58146.666666666664, ans=0.125 2023-09-28 15:17:38,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:17:41,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 15:17:41,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=58146.666666666664, ans=0.0 2023-09-28 15:17:47,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:17:50,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:50,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:51,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:17:59,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:18:03,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 15:18:05,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=58280.0, ans=0.125 2023-09-28 15:18:10,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-09-28 15:18:10,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:12,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:14,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:18:14,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:18:17,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:18:20,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:18:20,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:18:22,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=58346.666666666664, ans=0.125 2023-09-28 15:18:25,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:25,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 15:18:29,465 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=58346.666666666664, ans=0.125 2023-09-28 15:18:32,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 15:18:37,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 15:18:38,596 INFO [train.py:1039] (0/4) Epoch 2, batch 3450, loss[loss=0.3007, simple_loss=0.3546, pruned_loss=0.1234, over 24568.00 frames. ], tot_loss[loss=0.3147, simple_loss=0.3545, pruned_loss=0.1375, over 4712907.42 frames. ], batch size: 71, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:18:41,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 15:18:41,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:43,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:18:43,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 15:18:46,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:49,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:18:55,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:18:55,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:18:57,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 15:18:57,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:59,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:06,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 15:19:10,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 15:19:10,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:19:10,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:19:13,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:14,455 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.56 vs. limit=15.0 2023-09-28 15:19:17,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=58546.666666666664, ans=10.0 2023-09-28 15:19:20,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 15:19:21,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 15:19:25,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:25,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:19:27,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:19:28,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:19:30,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 15:19:30,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:19:32,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:19:37,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 15:19:39,902 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.69 vs. limit=22.5 2023-09-28 15:19:42,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 15:19:47,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:19:47,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:52,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:19:55,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=58680.0, ans=0.04949747468305833 2023-09-28 15:19:57,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:57,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:57,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:19:58,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:20:01,739 INFO [train.py:1039] (0/4) Epoch 2, batch 3500, loss[loss=0.3119, simple_loss=0.3628, pruned_loss=0.1305, over 23992.00 frames. ], tot_loss[loss=0.312, simple_loss=0.3518, pruned_loss=0.1361, over 4704251.84 frames. ], batch size: 80, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:20:01,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:06,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 15:20:06,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 15:20:08,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:20:09,931 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.839e+02 3.369e+02 4.173e+02 9.194e+02, threshold=6.738e+02, percent-clipped=6.0 2023-09-28 15:20:11,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:20:13,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:13,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 15:20:18,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:20:20,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:20:22,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:20:22,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:24,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:20:24,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:20:24,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:25,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 15:20:30,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:32,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:20:33,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:35,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=58880.0, ans=0.125 2023-09-28 15:20:38,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:38,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 15:20:38,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:43,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:45,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:20:46,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:20:48,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:20:48,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:52,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 15:20:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 15:20:52,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 15:20:53,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:55,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:56,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:57,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:21:01,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:21:01,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:21:06,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:21:07,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 15:21:07,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 15:21:07,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:11,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:12,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:14,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:16,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 15:21:16,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:18,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=59013.333333333336, ans=0.125 2023-09-28 15:21:19,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:21:19,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 15:21:22,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. 
Duration: 0.45 2023-09-28 15:21:23,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=59080.0, ans=0.2 2023-09-28 15:21:24,867 INFO [train.py:1039] (0/4) Epoch 2, batch 3550, loss[loss=0.2722, simple_loss=0.284, pruned_loss=0.1302, over 19068.00 frames. ], tot_loss[loss=0.3092, simple_loss=0.3493, pruned_loss=0.1345, over 4700263.22 frames. ], batch size: 389, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:21:25,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:28,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:28,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:28,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:31,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:21:41,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:43,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 15:21:43,976 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.06 vs. limit=15.0 2023-09-28 15:21:46,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:21:47,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:21:49,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:51,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:21:51,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:21:54,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:54,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:21:55,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:55,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:21:57,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:22:02,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:22:02,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:22:05,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 15:22:05,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:05,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:22:05,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 15:22:05,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:07,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:08,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:22:09,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=59213.333333333336, ans=0.125 2023-09-28 15:22:14,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:14,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:22:16,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:17,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 15:22:18,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 15:22:19,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 15:22:19,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:19,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=59280.0, ans=0.125 2023-09-28 15:22:21,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:22:21,945 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:22:23,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:22:25,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 15:22:26,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:33,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:33,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 15:22:34,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:22:36,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=59346.666666666664, ans=0.125 2023-09-28 15:22:38,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:39,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 15:22:48,090 INFO [train.py:1039] (0/4) Epoch 2, batch 3600, loss[loss=0.2817, simple_loss=0.3408, pruned_loss=0.1113, over 24495.00 frames. ], tot_loss[loss=0.3082, simple_loss=0.3487, pruned_loss=0.1339, over 4711498.67 frames. ], batch size: 63, lr: 3.54e-02, grad_scale: 32.0 2023-09-28 15:22:48,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 15:22:48,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:22:49,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:22:51,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:51,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:53,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 15:22:56,190 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.598e+02 2.903e+02 3.548e+02 6.359e+02, threshold=5.806e+02, percent-clipped=0.0 2023-09-28 15:22:57,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:59,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:00,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:23:01,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:23:02,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:02,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 15:23:08,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:23:09,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:12,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:13,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:23:15,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:23:15,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:23:15,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 15:23:16,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:19,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:20,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:23:22,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=59546.666666666664, ans=0.125 2023-09-28 15:23:23,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:25,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:25,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:23:26,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 15:23:35,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 15:23:36,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:23:36,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 15:23:39,235 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.49 vs. limit=10.0 2023-09-28 15:23:43,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:23:48,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:51,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:58,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:23:58,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:23:58,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 15:24:00,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 15:24:01,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 15:24:05,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:24:05,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:24:05,586 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=59680.0, ans=0.0 2023-09-28 15:24:06,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 15:24:06,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:08,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:24:08,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:08,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 15:24:09,700 INFO [train.py:1039] (0/4) Epoch 2, batch 3650, loss[loss=0.2973, simple_loss=0.3353, pruned_loss=0.1297, over 23579.00 frames. ], tot_loss[loss=0.3086, simple_loss=0.3491, pruned_loss=0.134, over 4715915.37 frames. ], batch size: 256, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:24:09,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 15:24:14,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:24:14,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. 
Duration: 0.69 2023-09-28 15:24:19,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 15:24:21,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:24:24,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 15:24:26,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 15:24:31,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:24:31,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:24:33,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:24:36,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:24:36,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:36,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 15:24:38,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:24:39,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:24:39,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 15:24:40,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:24:41,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:24:41,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:43,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:24:46,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 15:24:47,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 15:24:47,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:24:49,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 15:24:50,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:24:51,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:24:57,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 15:24:59,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:59,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:25:02,307 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=15.0 2023-09-28 15:25:02,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:25:04,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:25:08,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:25:09,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:11,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:11,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:25:15,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:25:16,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:25:16,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:24,096 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 15:25:27,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:27,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:27,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:25:29,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:31,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:25:32,597 INFO [train.py:1039] (0/4) Epoch 2, batch 3700, loss[loss=0.3291, simple_loss=0.383, pruned_loss=0.1376, over 24569.00 frames. ], tot_loss[loss=0.3103, simple_loss=0.3507, pruned_loss=0.135, over 4720346.87 frames. ], batch size: 71, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:25:32,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:32,870 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=60080.0, ans=0.0 2023-09-28 15:25:34,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-09-28 15:25:34,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:36,649 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.53 vs. limit=22.5 2023-09-28 15:25:37,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:25:39,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:39,955 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=60080.0, ans=0.125 2023-09-28 15:25:41,574 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.121e+02 2.788e+02 3.403e+02 4.126e+02 8.216e+02, threshold=6.806e+02, percent-clipped=7.0 2023-09-28 15:25:41,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:25:43,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:43,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 15:25:43,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:43,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-28 15:25:45,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:25:46,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:25:50,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:51,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:51,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:25:53,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:53,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:25:55,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:58,192 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 15:26:06,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:26:08,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:26:09,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 15:26:09,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 15:26:09,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:14,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:14,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 15:26:17,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:18,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:26:23,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:23,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:26:24,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:26:29,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:29,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 15:26:31,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:26:31,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 15:26:36,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:26:37,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:26:41,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:41,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 15:26:44,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:26:44,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:26:44,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:44,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:45,222 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.00 vs. limit=22.5 2023-09-28 15:26:47,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:48,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. 
Duration: 0.93 2023-09-28 15:26:49,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 15:26:50,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:26:50,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:26:52,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:26:54,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:26:55,337 INFO [train.py:1039] (0/4) Epoch 2, batch 3750, loss[loss=0.2752, simple_loss=0.3302, pruned_loss=0.1101, over 24463.00 frames. ], tot_loss[loss=0.3114, simple_loss=0.3517, pruned_loss=0.1355, over 4713751.43 frames. ], batch size: 66, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:26:57,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:58,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:26:58,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=60413.333333333336, ans=0.125 2023-09-28 15:27:00,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:02,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. 
Duration: 0.89 2023-09-28 15:27:03,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:27:06,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:27:06,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 15:27:08,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:27:08,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:09,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:10,277 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=60480.0, ans=0.07 2023-09-28 15:27:11,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:27:15,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:17,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:27:20,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:27:24,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:27:27,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:28,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 15:27:30,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:31,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:31,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:34,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 15:27:39,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 15:27:41,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:41,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:43,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:48,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:50,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. 
Duration: 0.42725 2023-09-28 15:27:53,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 15:27:57,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:00,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:28:00,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:28:03,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:28:08,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:28:08,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=60680.0, ans=0.1 2023-09-28 15:28:10,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:28:13,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:28:15,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:28:15,971 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.59 vs. 
limit=15.0 2023-09-28 15:28:18,199 INFO [train.py:1039] (0/4) Epoch 2, batch 3800, loss[loss=0.311, simple_loss=0.3323, pruned_loss=0.1448, over 23436.00 frames. ], tot_loss[loss=0.313, simple_loss=0.3523, pruned_loss=0.1369, over 4713508.03 frames. ], batch size: 285, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:28:18,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:28:24,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.whiten.whitening_limit, batch_count=60746.666666666664, ans=12.0 2023-09-28 15:28:25,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:28:26,487 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.661e+02 3.070e+02 3.841e+02 5.617e+02, threshold=6.140e+02, percent-clipped=0.0 2023-09-28 15:28:30,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:30,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:28:32,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 15:28:35,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:36,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:38,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 15:28:40,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:28:40,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:40,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:28:41,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:41,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:28:43,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:44,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 15:28:45,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=60813.333333333336, ans=0.125 2023-09-28 15:28:48,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=60813.333333333336, ans=0.125 2023-09-28 15:28:49,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 15:28:49,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 15:28:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:54,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:28:54,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:28:56,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:28:56,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:58,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:59,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:29:05,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:29:06,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 15:29:08,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:29:13,369 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=60946.666666666664, ans=0.1 2023-09-28 15:29:16,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:17,526 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.48 vs. limit=6.0 2023-09-28 15:29:20,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=60946.666666666664, ans=0.1 2023-09-28 15:29:22,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:29:24,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 15:29:25,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 15:29:27,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:29:29,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:29,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:32,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-09-28 15:29:34,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 15:29:34,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 15:29:34,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:36,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:41,183 INFO [train.py:1039] (0/4) Epoch 2, batch 3850, loss[loss=0.3448, simple_loss=0.3846, pruned_loss=0.1525, over 24384.00 frames. ], tot_loss[loss=0.3116, simple_loss=0.3515, pruned_loss=0.1358, over 4716278.86 frames. ], batch size: 77, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:29:42,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:29:42,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:29:43,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=61080.0, ans=0.0 2023-09-28 15:29:49,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:29:49,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 15:29:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 15:29:52,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:53,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=61080.0, ans=0.0 2023-09-28 15:29:55,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:29:57,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:01,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:30:02,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 15:30:09,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:10,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:30:14,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:14,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:30:17,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 15:30:19,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:19,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:30:21,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:23,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:24,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:24,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:30:24,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 15:30:24,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 15:30:24,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:25,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:28,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:28,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:30:29,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 15:30:31,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 15:30:34,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 15:30:39,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:30:44,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:45,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:50,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:50,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 15:30:54,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 15:30:56,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:57,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:30:59,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:31:00,144 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.72 vs. limit=15.0 2023-09-28 15:31:00,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:31:02,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:31:02,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 15:31:03,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:31:03,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 15:31:05,413 INFO [train.py:1039] (0/4) Epoch 2, batch 3900, loss[loss=0.292, simple_loss=0.3311, pruned_loss=0.1264, over 23897.00 frames. ], tot_loss[loss=0.3099, simple_loss=0.3493, pruned_loss=0.1352, over 4709040.54 frames. ], batch size: 195, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:31:05,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:31:05,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:09,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:31:09,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:10,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:31:10,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=61413.333333333336, ans=0.95 2023-09-28 15:31:12,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:12,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:31:13,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:13,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 15:31:14,947 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.111e+02 3.017e+02 3.758e+02 4.866e+02 8.103e+02, threshold=7.517e+02, percent-clipped=9.0 2023-09-28 15:31:15,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:31:19,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:19,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:19,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:31:21,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:23,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:23,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:25,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:31:25,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 15:31:25,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:25,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=61480.0, ans=0.0 2023-09-28 15:31:29,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 15:31:29,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:31:30,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 15:31:32,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 15:31:37,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:37,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:37,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:31:37,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:31:44,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:45,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:31:47,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:31:47,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:31:48,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:31:54,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:31:55,237 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=61613.333333333336, ans=0.125 2023-09-28 15:31:56,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:32:03,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:32:06,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:32:10,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=61680.0, ans=0.125 2023-09-28 15:32:15,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:18,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:18,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 15:32:18,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 15:32:18,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:21,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 15:32:22,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:32:23,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 15:32:27,670 INFO [train.py:1039] (0/4) Epoch 2, batch 3950, loss[loss=0.2895, simple_loss=0.3391, pruned_loss=0.1199, over 24692.00 frames. ], tot_loss[loss=0.3086, simple_loss=0.3488, pruned_loss=0.1342, over 4720313.48 frames. ], batch size: 65, lr: 3.50e-02, grad_scale: 16.0 2023-09-28 15:32:28,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=61746.666666666664, ans=0.2 2023-09-28 15:32:30,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:32:32,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 15:32:32,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:32:33,622 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.40 vs. limit=15.0 2023-09-28 15:32:35,237 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=4.53 vs. limit=12.0 2023-09-28 15:32:35,312 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.47 vs. limit=12.0 2023-09-28 15:32:36,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:32:36,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:32:42,764 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 15:32:42,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:42,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 15:32:44,363 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 15:32:44,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:47,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:47,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:32:47,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:51,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 15:32:54,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:32:54,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:54,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 15:32:55,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:32:55,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:32:59,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=61880.0, ans=0.2 2023-09-28 15:33:09,404 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.62 vs. limit=10.0 2023-09-28 15:33:10,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:33:10,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:33:15,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 15:33:16,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=61946.666666666664, ans=0.125 2023-09-28 15:33:21,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 15:33:21,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 15:33:21,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:33:21,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 15:33:31,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:33:31,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:33:31,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:33:31,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:33:32,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 15:33:36,245 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=62013.333333333336, ans=0.125 2023-09-28 15:33:37,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:33:37,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:33:43,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 15:33:45,750 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.68 vs. limit=15.0 2023-09-28 15:33:50,908 INFO [train.py:1039] (0/4) Epoch 2, batch 4000, loss[loss=0.3227, simple_loss=0.3543, pruned_loss=0.1456, over 23707.00 frames. ], tot_loss[loss=0.3096, simple_loss=0.35, pruned_loss=0.1346, over 4718008.35 frames. 
], batch size: 164, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:33:53,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:00,520 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.115e+02 2.667e+02 3.102e+02 3.739e+02 5.797e+02, threshold=6.204e+02, percent-clipped=0.0 2023-09-28 15:34:02,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:05,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:05,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:34:06,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:06,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 15:34:08,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:34:08,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 15:34:08,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:34:08,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-09-28 15:34:10,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:12,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=62146.666666666664, ans=0.125 2023-09-28 15:34:15,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:34:16,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:34:16,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:34:16,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:16,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:34:16,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=62146.666666666664, ans=0.125 2023-09-28 15:34:17,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:34:19,441 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 15:34:20,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:34:20,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:34:24,104 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 15:34:24,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:34:24,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:33,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 15:34:33,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:35,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:34:37,056 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 15:34:38,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:34:38,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 15:34:38,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:34:40,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=62280.0, ans=0.035 2023-09-28 15:34:41,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:34:41,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:34:43,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:34:45,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:34:45,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:47,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 15:34:47,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:50,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 15:34:53,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:34:54,856 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.44 vs. limit=15.0 2023-09-28 15:34:56,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:34:57,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=62346.666666666664, ans=0.2 2023-09-28 15:34:59,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 15:34:59,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:01,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:35:01,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:07,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:11,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:35:11,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 15:35:12,616 INFO [train.py:1039] (0/4) Epoch 2, batch 4050, loss[loss=0.3035, simple_loss=0.3382, pruned_loss=0.1344, over 23761.00 frames. ], tot_loss[loss=0.3084, simple_loss=0.3493, pruned_loss=0.1337, over 4718131.73 frames. ], batch size: 164, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:35:14,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:35:14,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:15,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:35:17,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-09-28 15:35:17,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:22,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:24,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:35:25,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:35:27,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:35:27,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:35:32,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:34,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:37,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 15:35:38,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 15:35:38,862 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 15:35:41,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 15:35:48,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 15:35:50,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:35:54,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:54,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=62546.666666666664, ans=0.125 2023-09-28 15:35:57,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:57,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:35:57,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:36:00,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:36:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 15:36:04,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:36:07,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:09,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. 
Duration: 0.83 2023-09-28 15:36:14,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:18,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=62680.0, ans=0.0 2023-09-28 15:36:23,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 15:36:24,358 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.46 vs. limit=22.5 2023-09-28 15:36:25,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:36:25,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:36:26,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 15:36:26,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 15:36:26,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:29,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:36:31,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 15:36:35,787 INFO [train.py:1039] (0/4) Epoch 2, batch 4100, loss[loss=0.2537, simple_loss=0.3193, pruned_loss=0.09399, over 24682.00 frames. ], tot_loss[loss=0.3075, simple_loss=0.3489, pruned_loss=0.133, over 4733628.41 frames. ], batch size: 65, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:36:37,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=62746.666666666664, ans=0.0 2023-09-28 15:36:38,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 15:36:39,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 15:36:40,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=62746.666666666664, ans=0.0 2023-09-28 15:36:42,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 15:36:44,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 15:36:44,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:44,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 15:36:45,995 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 15:36:47,356 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.099e+02 2.677e+02 3.262e+02 4.112e+02 6.784e+02, threshold=6.525e+02, percent-clipped=2.0 2023-09-28 15:36:49,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:49,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:36:49,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:51,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:36:56,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:36:56,858 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.78 vs. limit=6.0 2023-09-28 15:36:58,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:58,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:36:58,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 15:36:59,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:36:59,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:36:59,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:01,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:37:01,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 15:37:06,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:07,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 15:37:07,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:37:12,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:12,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 15:37:13,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:37:13,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:37:13,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 15:37:16,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 15:37:19,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:37:19,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:37:22,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 15:37:23,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:37:23,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:27,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:29,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=62946.666666666664, ans=0.2 2023-09-28 15:37:32,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:37:35,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:35,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:37:38,952 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.24 vs. 
limit=12.0 2023-09-28 15:37:46,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:37:46,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:48,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:49,366 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.51 vs. limit=22.5 2023-09-28 15:37:51,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:37:58,066 INFO [train.py:1039] (0/4) Epoch 2, batch 4150, loss[loss=0.2974, simple_loss=0.3366, pruned_loss=0.1291, over 24358.00 frames. ], tot_loss[loss=0.3078, simple_loss=0.3493, pruned_loss=0.1331, over 4730791.05 frames. ], batch size: 56, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:37:58,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:58,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:37:59,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:37:59,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:03,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. 
Duration: 0.86 2023-09-28 15:38:03,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:03,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 15:38:05,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 15:38:05,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 15:38:06,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:11,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:38:11,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:15,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:17,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:18,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:38:20,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:38:20,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 15:38:21,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:38:25,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:30,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:31,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 15:38:34,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 15:38:34,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:38:36,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 15:38:36,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:38:36,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:39,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:39,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:45,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. 
Duration: 0.99 2023-09-28 15:38:48,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:38:51,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:38:52,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 15:38:53,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:55,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 15:38:55,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:38:58,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:39:00,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:00,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 15:39:00,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:00,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:39:03,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 15:39:06,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 15:39:07,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:07,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:39:07,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:39:07,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 15:39:09,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:39:09,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:39:10,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:39:14,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:14,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 15:39:14,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 15:39:16,086 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=63346.666666666664, ans=0.1 2023-09-28 15:39:19,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:39:21,071 INFO [train.py:1039] (0/4) Epoch 2, batch 4200, loss[loss=0.3016, simple_loss=0.3313, pruned_loss=0.136, over 23690.00 frames. ], tot_loss[loss=0.3063, simple_loss=0.347, pruned_loss=0.1328, over 4719439.74 frames. ], batch size: 149, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:39:21,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 15:39:24,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:39:26,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:27,238 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.20 vs. limit=15.0 2023-09-28 15:39:27,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:39:27,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:27,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:30,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-09-28 15:39:32,114 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.187e+02 2.868e+02 3.365e+02 4.143e+02 5.998e+02, threshold=6.730e+02, percent-clipped=0.0 2023-09-28 15:39:33,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 15:39:33,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:37,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:39,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:39:42,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:39:46,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:39:46,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:47,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 15:39:47,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:49,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:49,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 15:39:49,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:39:52,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:39:55,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 15:39:55,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:59,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:40:01,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:40:02,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:40:02,966 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=63546.666666666664, ans=0.0 2023-09-28 15:40:04,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:07,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:40:07,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 15:40:07,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:40:08,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:40:14,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:40:17,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:22,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:40:25,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 15:40:29,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:33,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=63680.0, ans=0.2 2023-09-28 15:40:34,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:40:34,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:37,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 15:40:42,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:40:43,865 INFO [train.py:1039] (0/4) Epoch 2, batch 4250, loss[loss=0.2961, simple_loss=0.3573, pruned_loss=0.1174, over 24320.00 frames. 
], tot_loss[loss=0.3052, simple_loss=0.3464, pruned_loss=0.132, over 4726237.59 frames. ], batch size: 74, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:40:46,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:46,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:40:47,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:54,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:40:55,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 15:40:55,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:58,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:01,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:02,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=63813.333333333336, ans=0.1 2023-09-28 15:41:05,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:07,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:41:08,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:41:08,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:10,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:10,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:11,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:12,758 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.71 vs. limit=15.0 2023-09-28 15:41:13,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:41:15,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:16,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 15:41:20,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 15:41:21,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:41:22,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:22,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:23,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:41:23,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:23,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:28,114 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=17.20 vs. limit=15.0 2023-09-28 15:41:28,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:41:30,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:41:34,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:41:35,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:35,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 15:41:35,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 15:41:37,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 15:41:39,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:41:40,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:41:43,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:43,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:45,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 15:41:47,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:41:48,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:41:48,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=63946.666666666664, ans=0.1 2023-09-28 15:41:51,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:55,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:56,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 15:41:59,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:41:59,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=64013.333333333336, ans=0.125 2023-09-28 15:42:00,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:02,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:42:02,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:02,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 15:42:02,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=64013.333333333336, ans=0.125 2023-09-28 15:42:05,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:09,190 INFO [train.py:1039] (0/4) Epoch 2, batch 4300, loss[loss=0.2842, simple_loss=0.336, pruned_loss=0.1162, over 24587.00 frames. ], tot_loss[loss=0.3046, simple_loss=0.3458, pruned_loss=0.1318, over 4715441.70 frames. ], batch size: 60, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:42:12,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:12,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 15:42:15,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:19,710 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.736e+02 3.234e+02 3.981e+02 6.423e+02, threshold=6.467e+02, percent-clipped=0.0 2023-09-28 15:42:23,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:42:23,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 15:42:24,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:42:26,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:42:26,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:42:26,275 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 15:42:29,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:42:30,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.32 vs. limit=6.0 2023-09-28 15:42:32,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:42:36,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. 
Duration: 0.97 2023-09-28 15:42:36,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:42:36,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 15:42:40,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:42:42,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:42:45,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:42:45,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:47,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:42:48,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:48,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:48,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 15:42:50,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. 
Duration: 0.88 2023-09-28 15:42:50,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=64213.333333333336, ans=0.125 2023-09-28 15:42:53,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:42:56,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:42:56,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:56,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 15:42:56,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 15:42:56,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=64280.0, ans=0.1 2023-09-28 15:42:57,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 15:42:59,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:42:59,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=64280.0, ans=0.035 2023-09-28 15:43:00,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 15:43:00,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 15:43:06,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:08,191 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 15:43:10,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:43:12,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:12,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:15,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 15:43:17,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:43:17,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:17,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:43:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:19,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:43:22,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:43:23,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=64346.666666666664, ans=0.0 2023-09-28 15:43:25,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:26,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:26,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:29,615 INFO [train.py:1039] (0/4) Epoch 2, batch 4350, loss[loss=0.3134, simple_loss=0.3636, pruned_loss=0.1317, over 23972.00 frames. ], tot_loss[loss=0.3038, simple_loss=0.3463, pruned_loss=0.1307, over 4727754.91 frames. ], batch size: 86, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:43:32,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 15:43:34,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:43:38,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 15:43:40,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:44,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:43:44,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:43:49,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:43:53,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:53,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=64480.0, ans=0.125 2023-09-28 15:43:55,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:43:55,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:55,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=64480.0, ans=0.0 2023-09-28 15:43:59,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:44:02,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:44:04,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 15:44:07,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=64546.666666666664, ans=0.0 2023-09-28 15:44:09,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 15:44:11,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:12,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:16,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:20,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 15:44:22,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:24,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:44:31,245 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 15:44:32,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:32,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:44:32,880 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. 
Duration: 0.896 2023-09-28 15:44:34,345 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 15:44:34,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:34,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:35,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:44:37,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:37,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:38,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:44:40,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 15:44:40,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:40,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:40,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:42,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. 
Duration: 0.99 2023-09-28 15:44:42,321 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 15:44:42,328 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 15:44:42,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 15:44:46,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:44:46,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:44:46,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:44:48,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:44:50,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 15:44:51,881 INFO [train.py:1039] (0/4) Epoch 2, batch 4400, loss[loss=0.2412, simple_loss=0.3035, pruned_loss=0.08942, over 24601.00 frames. ], tot_loss[loss=0.3047, simple_loss=0.3472, pruned_loss=0.1311, over 4721383.45 frames. ], batch size: 60, lr: 3.45e-02, grad_scale: 32.0 2023-09-28 15:44:52,102 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 15:44:52,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:56,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 15:44:56,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:58,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:45:00,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 15:45:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 15:45:02,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 15:45:02,459 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 15:45:03,801 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.169e+02 2.849e+02 3.157e+02 3.871e+02 7.582e+02, threshold=6.315e+02, percent-clipped=2.0 2023-09-28 15:45:03,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:45:03,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:45:04,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=64746.666666666664, ans=0.125 2023-09-28 15:45:05,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 15:45:08,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:45:10,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:10,167 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 15:45:13,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:13,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 15:45:13,304 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 15:45:17,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 15:45:17,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 15:45:17,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 15:45:19,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:19,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:45:22,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 15:45:22,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 15:45:22,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:26,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:45:26,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:27,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:29,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:30,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 15:45:30,130 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 15:45:33,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:40,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:43,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. 
Duration: 0.78 2023-09-28 15:45:45,579 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.48 vs. limit=15.0 2023-09-28 15:45:48,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:45:51,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:45:54,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:45:54,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 15:45:54,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:45:54,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:45:54,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:45:54,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=64946.666666666664, ans=0.125 2023-09-28 15:45:55,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:46:00,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 15:46:04,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-09-28 15:46:05,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 15:46:05,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:05,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 15:46:08,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:46:12,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:46:14,485 INFO [train.py:1039] (0/4) Epoch 2, batch 4450, loss[loss=0.2851, simple_loss=0.3356, pruned_loss=0.1174, over 24635.00 frames. ], tot_loss[loss=0.3063, simple_loss=0.3485, pruned_loss=0.132, over 4724707.08 frames. ], batch size: 60, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:46:14,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 15:46:17,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:46:20,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:22,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:46:29,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:46:29,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:46:34,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:34,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=65146.666666666664, ans=0.0 2023-09-28 15:46:37,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:46:39,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:46:41,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:42,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 15:46:42,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:42,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:43,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:46:43,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 15:46:43,245 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=65146.666666666664, ans=0.1 2023-09-28 15:46:43,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=65146.666666666664, ans=0.0 2023-09-28 15:46:46,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:46:49,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=65213.333333333336, ans=0.125 2023-09-28 15:46:50,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:51,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:53,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:53,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:55,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:46:56,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=65213.333333333336, ans=0.125 2023-09-28 15:46:58,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-28 15:46:59,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 15:46:59,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 15:46:59,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:47:01,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:02,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 15:47:07,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:47:09,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:11,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 15:47:11,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:11,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:11,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:47:11,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:47:11,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=65280.0, ans=0.1 2023-09-28 15:47:15,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:20,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:47:21,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 15:47:23,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:47:23,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=65346.666666666664, ans=0.125 2023-09-28 15:47:26,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:47:26,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:28,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:28,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:47:31,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 15:47:34,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=65413.333333333336, ans=0.0 2023-09-28 15:47:35,424 INFO [train.py:1039] (0/4) Epoch 2, batch 4500, loss[loss=0.2869, simple_loss=0.3381, pruned_loss=0.1178, over 24473.00 frames. ], tot_loss[loss=0.3054, simple_loss=0.3479, pruned_loss=0.1315, over 4723637.34 frames. ], batch size: 63, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:47:35,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 15:47:37,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:47:41,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:42,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=65413.333333333336, ans=0.125 2023-09-28 15:47:43,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 15:47:43,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 15:47:43,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=65413.333333333336, ans=0.125 2023-09-28 15:47:44,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:47:46,456 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.128e+02 2.918e+02 3.364e+02 4.065e+02 7.320e+02, threshold=6.729e+02, percent-clipped=3.0 2023-09-28 15:47:50,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:50,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:51,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:47:53,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:47:53,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:47:53,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:47:53,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=65480.0, ans=0.0 2023-09-28 15:48:06,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:48:06,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:48:09,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:48:11,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:48:11,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:48:16,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:48:16,376 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=65546.66666666667, ans=0.025 2023-09-28 15:48:20,688 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.88 vs. limit=12.0 2023-09-28 15:48:22,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:48:26,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:48:28,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:48:29,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 15:48:30,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:31,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 15:48:32,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=65613.33333333333, ans=0.125 2023-09-28 15:48:33,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:36,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:48:36,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 15:48:36,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:48:36,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:42,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:48:42,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:48:44,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:45,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 15:48:46,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:48:47,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 15:48:49,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 15:48:51,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 15:48:51,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=65680.0, ans=0.125 2023-09-28 15:48:56,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 15:48:57,878 INFO [train.py:1039] (0/4) Epoch 2, batch 4550, loss[loss=0.3088, simple_loss=0.3337, pruned_loss=0.1419, over 23819.00 frames. ], tot_loss[loss=0.3052, simple_loss=0.3475, pruned_loss=0.1315, over 4726777.42 frames. ], batch size: 179, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:48:58,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 15:48:58,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:01,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:01,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:49:06,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:08,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=65746.66666666667, ans=0.125 2023-09-28 15:49:11,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:49:11,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=65746.66666666667, ans=0.125 2023-09-28 15:49:13,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:49:16,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:16,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:49:16,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:18,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=65813.33333333333, ans=0.2 2023-09-28 15:49:19,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:19,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 15:49:23,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:49:26,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 15:49:27,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 15:49:27,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:49:29,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 15:49:33,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 15:49:33,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:37,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 15:49:38,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:49:41,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 15:49:45,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 15:49:47,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:49:49,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:49,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:51,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:52,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 15:49:52,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 15:49:54,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:49:54,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 15:49:56,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 15:49:58,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:58,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:49:58,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:00,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:01,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:50:01,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:50:03,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 15:50:06,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:50:06,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:50:06,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 15:50:06,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:50:06,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 15:50:10,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:50:11,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 15:50:13,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:50:13,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:14,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:50:17,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:50:18,867 INFO [train.py:1039] (0/4) Epoch 2, batch 4600, loss[loss=0.314, simple_loss=0.3468, pruned_loss=0.1406, over 23609.00 frames. ], tot_loss[loss=0.3034, simple_loss=0.3451, pruned_loss=0.1308, over 4715252.55 frames. ], batch size: 256, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:50:19,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:50:20,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:22,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:22,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=66080.0, ans=0.125 2023-09-28 15:50:25,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:50:25,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 15:50:26,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:28,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 15:50:29,724 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.215e+02 2.622e+02 3.070e+02 3.813e+02 6.355e+02, threshold=6.141e+02, percent-clipped=0.0 2023-09-28 15:50:29,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:50:30,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=66080.0, ans=0.0 2023-09-28 15:50:33,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:50:33,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:35,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:42,246 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=66146.66666666667, ans=0.2 2023-09-28 15:50:42,694 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.22 vs. limit=6.0 2023-09-28 15:50:43,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 15:50:45,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:50:48,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:49,288 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-09-28 15:50:51,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:50:51,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:57,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 15:50:57,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:50:57,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:04,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:05,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:51:06,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:51:06,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=66280.0, ans=0.0 2023-09-28 15:51:11,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-09-28 15:51:12,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:51:16,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:16,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:51:20,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:20,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 15:51:20,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:22,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 15:51:22,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:23,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:23,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=66346.66666666667, ans=0.02 2023-09-28 15:51:24,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:51:25,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=66346.66666666667, ans=0.125 2023-09-28 15:51:26,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:51:27,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:27,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 15:51:29,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 15:51:29,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 15:51:29,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:32,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:32,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:33,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:41,364 INFO [train.py:1039] (0/4) Epoch 2, batch 4650, loss[loss=0.3192, simple_loss=0.3608, pruned_loss=0.1388, over 23556.00 frames. ], tot_loss[loss=0.3024, simple_loss=0.3452, pruned_loss=0.1298, over 4721534.44 frames. 
], batch size: 106, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:51:44,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:51:47,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:49,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:49,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:51:50,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:50,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:50,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:56,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 15:51:58,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:51:59,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 15:51:59,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:52:01,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. 
Duration: 0.94 2023-09-28 15:52:01,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:52:02,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 15:52:02,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 15:52:02,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:04,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:52:07,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:52:08,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:09,003 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 15:52:14,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:15,289 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=2.573e-03 2023-09-28 15:52:16,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 15:52:18,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 15:52:18,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:52:18,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=66546.66666666667, ans=0.2 2023-09-28 15:52:19,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 15:52:21,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:52:24,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:52:29,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:33,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:33,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=66613.33333333333, ans=0.125 2023-09-28 15:52:36,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:36,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:52:39,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. 
Duration: 0.62 2023-09-28 15:52:39,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 15:52:39,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 15:52:39,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 15:52:43,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:52:45,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=66613.33333333333, ans=0.0 2023-09-28 15:52:48,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:52:48,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:52:48,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 15:52:50,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:51,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:51,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:52:53,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 15:52:55,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:52:55,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:55,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:53:01,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:01,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:53:01,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:53:03,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 15:53:03,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:53:05,176 INFO [train.py:1039] (0/4) Epoch 2, batch 4700, loss[loss=0.2683, simple_loss=0.3232, pruned_loss=0.1067, over 15820.00 frames. ], tot_loss[loss=0.3038, simple_loss=0.3462, pruned_loss=0.1307, over 4707544.76 frames. ], batch size: 34, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:53:05,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=66746.66666666667, ans=0.0 2023-09-28 15:53:06,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. 
Duration: 0.72 2023-09-28 15:53:13,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=66746.66666666667, ans=0.0 2023-09-28 15:53:13,870 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.31 vs. limit=22.5 2023-09-28 15:53:14,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:15,873 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.037e+02 2.802e+02 3.291e+02 3.873e+02 6.346e+02, threshold=6.582e+02, percent-clipped=1.0 2023-09-28 15:53:15,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:16,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:53:16,354 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:53:18,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:19,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 15:53:19,833 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=66813.33333333333, ans=0.125 2023-09-28 15:53:19,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=66813.33333333333, ans=0.0 2023-09-28 15:53:26,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 15:53:26,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 15:53:28,117 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:53:29,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:30,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:53:30,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:53:34,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:40,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:53:41,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:53:44,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 15:53:51,352 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.36 vs. limit=15.0 2023-09-28 15:53:51,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 15:53:54,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:53:55,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:53:59,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 15:54:00,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:05,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:54:06,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 15:54:07,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:08,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:10,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:54:10,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 15:54:10,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 15:54:12,378 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 15:54:13,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:17,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 15:54:18,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:19,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=67013.33333333333, ans=0.125 2023-09-28 15:54:22,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 15:54:23,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:54:24,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:27,490 INFO [train.py:1039] (0/4) Epoch 2, batch 4750, loss[loss=0.2972, simple_loss=0.3341, pruned_loss=0.1302, over 23635.00 frames. 
], tot_loss[loss=0.3044, simple_loss=0.3467, pruned_loss=0.131, over 4716517.30 frames. ], batch size: 232, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:54:31,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:31,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:54:32,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 15:54:34,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:54:38,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 15:54:39,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=67080.0, ans=0.0 2023-09-28 15:54:40,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:54:40,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:41,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:42,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=67146.66666666667, ans=0.125 2023-09-28 15:54:47,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-09-28 15:54:51,164 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.21 vs. limit=15.0 2023-09-28 15:54:51,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:54:52,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 15:54:54,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:55,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:55,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:57,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:59,158 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 15:54:59,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 15:54:59,745 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.08 vs. limit=15.0 2023-09-28 15:55:04,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 15:55:05,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 15:55:07,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:10,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:55:10,640 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 15:55:10,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:12,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:55:17,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:55:18,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 15:55:18,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 15:55:20,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:55:20,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:55:20,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:23,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 15:55:23,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 15:55:26,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 15:55:28,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:55:33,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:55:33,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 15:55:33,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:55:33,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=67346.66666666667, ans=0.125 2023-09-28 15:55:35,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:36,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:55:38,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:38,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 15:55:39,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=67346.66666666667, ans=0.125 2023-09-28 15:55:43,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:43,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 15:55:44,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 15:55:46,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 15:55:48,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:55:48,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:49,538 INFO [train.py:1039] (0/4) Epoch 2, batch 4800, loss[loss=0.2943, simple_loss=0.3521, pruned_loss=0.1182, over 24692.00 frames. ], tot_loss[loss=0.3043, simple_loss=0.3473, pruned_loss=0.1307, over 4721452.75 frames. ], batch size: 65, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:55:51,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 15:55:56,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:57,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:56:01,840 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.813e+02 3.481e+02 4.018e+02 6.093e+02, threshold=6.961e+02, percent-clipped=0.0 2023-09-28 15:56:04,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:56:05,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:05,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:05,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=67480.0, ans=0.125 2023-09-28 15:56:07,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 15:56:08,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:56:08,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:56:11,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:56:15,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:56:15,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=67480.0, ans=0.025 2023-09-28 15:56:18,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:18,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:56:20,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:20,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:56:20,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:21,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:22,379 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.72 vs. limit=15.0 2023-09-28 15:56:24,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 15:56:29,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:56:32,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:56:33,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:34,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 15:56:36,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 15:56:37,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:37,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:56:37,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:56:37,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:37,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:56:39,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:56:41,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 15:56:43,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=67613.33333333333, ans=0.125 2023-09-28 15:56:46,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:56:47,275 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.74 vs. limit=22.5 2023-09-28 15:56:48,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:50,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:56:55,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 15:56:55,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:57,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:57,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:56:58,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:02,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:57:04,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 15:57:04,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:04,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:57:05,136 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.56 vs. limit=10.0 2023-09-28 15:57:05,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:57:05,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:57:07,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=67680.0, ans=0.125 2023-09-28 15:57:10,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:10,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:10,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:57:11,597 INFO [train.py:1039] (0/4) Epoch 2, batch 4850, loss[loss=0.2972, simple_loss=0.3407, pruned_loss=0.1268, over 23529.00 frames. ], tot_loss[loss=0.3044, simple_loss=0.348, pruned_loss=0.1304, over 4740418.45 frames. ], batch size: 106, lr: 3.40e-02, grad_scale: 32.0 2023-09-28 15:57:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. 
Duration: 0.94 2023-09-28 15:57:14,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 15:57:14,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:14,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:15,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:15,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:18,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:26,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 15:57:26,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=67746.66666666667, ans=0.125 2023-09-28 15:57:27,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:33,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:33,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:57:33,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 15:57:40,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:40,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:57:41,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:57:41,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 15:57:46,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:47,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:57:47,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:57:49,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:57:49,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 15:57:50,271 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.70 vs. limit=22.5 2023-09-28 15:57:52,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:52,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 15:57:55,113 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=67880.0, ans=0.125 2023-09-28 15:57:56,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 15:57:56,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 15:57:56,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=67880.0, ans=0.1 2023-09-28 15:57:57,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:57:58,468 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.39 vs. limit=12.0 2023-09-28 15:58:06,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:58:06,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 15:58:06,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=67946.66666666667, ans=0.125 2023-09-28 15:58:08,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:58:08,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 15:58:11,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:58:11,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=67946.66666666667, ans=0.1 2023-09-28 15:58:14,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 15:58:14,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:16,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 15:58:16,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:16,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:18,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 15:58:18,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=68013.33333333333, ans=0.0 2023-09-28 15:58:25,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:29,242 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.75 vs. 
limit=15.0 2023-09-28 15:58:31,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:58:31,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:34,387 INFO [train.py:1039] (0/4) Epoch 2, batch 4900, loss[loss=0.2809, simple_loss=0.332, pruned_loss=0.1149, over 21071.00 frames. ], tot_loss[loss=0.3046, simple_loss=0.3483, pruned_loss=0.1305, over 4744722.07 frames. ], batch size: 46, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:58:37,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 15:58:37,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:58:42,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:42,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:42,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:58:46,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 15:58:47,477 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.052e+02 2.694e+02 3.057e+02 3.718e+02 7.972e+02, threshold=6.114e+02, percent-clipped=1.0 2023-09-28 15:58:50,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-09-28 15:58:54,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 15:58:54,645 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.30 vs. limit=15.0 2023-09-28 15:58:55,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 15:58:57,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:58:57,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:57,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:58:57,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:57,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:58:59,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 15:59:02,484 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.24 vs. limit=15.0 2023-09-28 15:59:03,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 15:59:03,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 15:59:06,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:59:06,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:59:07,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=68213.33333333333, ans=0.125 2023-09-28 15:59:09,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:59:09,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:10,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:10,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 15:59:12,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:59:13,330 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.94 vs. limit=12.0 2023-09-28 15:59:14,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:59:14,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 15:59:14,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-09-28 15:59:17,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 15:59:20,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:59:22,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:59:22,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:59:22,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=68280.0, ans=0.125 2023-09-28 15:59:23,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:23,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:59:25,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:59:25,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 15:59:26,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:28,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 15:59:30,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=68280.0, ans=0.125 2023-09-28 15:59:31,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:59:35,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 15:59:35,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:59:37,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:59:38,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 15:59:43,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=68346.66666666667, ans=0.125 2023-09-28 15:59:44,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:46,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:59:47,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 15:59:47,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:59:47,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 15:59:50,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:50,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=68346.66666666667, ans=0.2 2023-09-28 15:59:54,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:59:54,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:59:54,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:54,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:59:55,941 INFO [train.py:1039] (0/4) Epoch 2, batch 4950, loss[loss=0.3122, simple_loss=0.3655, pruned_loss=0.1294, over 24051.00 frames. ], tot_loss[loss=0.3031, simple_loss=0.3462, pruned_loss=0.13, over 4732420.95 frames. ], batch size: 86, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:59:56,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=68413.33333333333, ans=0.0 2023-09-28 15:59:57,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:00:00,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:00,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-28 16:00:03,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 16:00:03,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 16:00:03,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:00:05,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 16:00:05,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:05,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:00:06,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:00:08,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:08,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=68413.33333333333, ans=0.0 2023-09-28 16:00:10,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:11,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:00:13,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 16:00:15,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:16,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:18,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:00:19,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:00:25,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:26,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:00:28,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:29,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:30,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=68546.66666666667, ans=0.1 2023-09-28 16:00:30,742 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.25 vs. limit=15.0 2023-09-28 16:00:31,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 16:00:31,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 16:00:33,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 16:00:36,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:37,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:00:37,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:00:38,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:00:38,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:00:39,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:00:41,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:45,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:00:47,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:00:48,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:00:50,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:50,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 16:00:51,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:00:52,327 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=68613.33333333333, ans=0.125 2023-09-28 16:00:53,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:00:58,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:00,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:01:01,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:01:01,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:01,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:01:03,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 16:01:04,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:01:05,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:01:06,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:01:07,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 16:01:10,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:15,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 16:01:15,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:01:16,659 INFO [train.py:1039] (0/4) Epoch 2, batch 5000, loss[loss=0.3224, simple_loss=0.355, pruned_loss=0.1449, over 23790.00 frames. ], tot_loss[loss=0.3018, simple_loss=0.3451, pruned_loss=0.1292, over 4731113.79 frames. ], batch size: 164, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:01:20,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=68746.66666666667, ans=0.0 2023-09-28 16:01:22,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:22,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 16:01:24,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 16:01:24,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=68746.66666666667, ans=0.125 2023-09-28 16:01:25,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 16:01:27,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:01:30,663 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.879e+02 2.855e+02 3.346e+02 4.050e+02 6.399e+02, threshold=6.691e+02, percent-clipped=1.0 2023-09-28 16:01:30,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 16:01:30,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:01:31,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:01:31,205 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=68746.66666666667, ans=0.2 2023-09-28 16:01:33,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 16:01:33,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:33,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 16:01:35,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 16:01:35,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:36,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:01:38,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 16:01:38,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 16:01:38,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:01:40,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 16:01:40,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:01:40,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:41,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:01:41,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 16:01:41,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. 
Duration: 0.6 2023-09-28 16:01:43,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=68813.33333333333, ans=0.035 2023-09-28 16:01:43,918 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.57 vs. limit=15.0 2023-09-28 16:01:44,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 16:01:44,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:45,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:46,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 16:01:47,199 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.11 vs. limit=15.0 2023-09-28 16:01:48,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:51,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:52,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:54,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-09-28 16:01:56,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 16:01:58,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:58,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:02:03,473 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 16:02:08,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:02:10,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:02:10,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:13,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 16:02:13,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:02:13,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:14,512 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.05 vs. limit=10.0 2023-09-28 16:02:15,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 16:02:16,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 16:02:17,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:19,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:21,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:27,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 16:02:31,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:39,041 INFO [train.py:1039] (0/4) Epoch 2, batch 5050, loss[loss=0.3109, simple_loss=0.3462, pruned_loss=0.1378, over 23818.00 frames. ], tot_loss[loss=0.3032, simple_loss=0.3458, pruned_loss=0.1303, over 4726284.84 frames. ], batch size: 164, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:02:41,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:41,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:42,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:02:42,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:02:42,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:02:42,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:02:43,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 16:02:47,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:02:50,884 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=69080.0, ans=0.2 2023-09-28 16:02:51,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:52,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=69080.0, ans=0.125 2023-09-28 16:02:52,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=69080.0, ans=0.125 2023-09-28 16:02:53,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:02:53,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. 
Duration: 0.94 2023-09-28 16:02:53,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:55,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:56,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:02:58,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:02:58,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:03:09,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 16:03:09,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:03:09,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:11,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 16:03:11,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:13,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:14,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 16:03:14,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:03:14,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 16:03:16,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 16:03:17,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:19,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:23,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:23,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 16:03:24,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:28,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 16:03:28,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:03:28,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:03:30,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:03:31,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:31,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:03:35,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:03:35,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:35,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:03:35,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:03:35,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 16:03:37,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:03:39,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:43,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:43,372 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 16:03:43,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. 
Duration: 0.688875 2023-09-28 16:03:44,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:03:45,479 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.93 vs. limit=22.5 2023-09-28 16:03:46,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:46,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 16:03:50,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:50,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 16:03:50,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:53,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:55,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:55,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 16:03:56,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 16:04:00,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 16:04:00,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:00,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:04:01,405 INFO [train.py:1039] (0/4) Epoch 2, batch 5100, loss[loss=0.2509, simple_loss=0.3032, pruned_loss=0.09927, over 24269.00 frames. ], tot_loss[loss=0.3024, simple_loss=0.3458, pruned_loss=0.1295, over 4734662.28 frames. ], batch size: 56, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:04:03,189 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 16:04:04,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:04:05,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=69413.33333333333, ans=0.0 2023-09-28 16:04:10,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 16:04:10,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 16:04:10,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:13,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 16:04:15,245 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.986e+02 2.824e+02 3.084e+02 3.697e+02 6.472e+02, threshold=6.168e+02, percent-clipped=0.0 2023-09-28 16:04:17,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:04:17,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 16:04:17,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 16:04:24,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:04:24,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:04:27,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:32,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 16:04:32,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:32,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:04:32,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 16:04:35,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 16:04:36,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 16:04:39,797 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 16:04:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:41,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 16:04:41,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 16:04:46,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:55,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:04:55,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=69613.33333333333, ans=0.1 2023-09-28 16:04:59,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 16:04:59,795 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 16:04:59,818 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. 
Duration: 0.896 2023-09-28 16:05:01,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 16:05:01,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:05:04,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 16:05:07,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 16:05:10,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 16:05:12,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:05:14,617 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.60 vs. limit=10.0 2023-09-28 16:05:15,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 16:05:17,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:05:18,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 16:05:23,791 INFO [train.py:1039] (0/4) Epoch 2, batch 5150, loss[loss=0.2637, simple_loss=0.3203, pruned_loss=0.1035, over 24595.00 frames. ], tot_loss[loss=0.3032, simple_loss=0.3465, pruned_loss=0.1299, over 4735475.08 frames. 
], batch size: 60, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:05:25,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:05:25,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:05:25,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:05:25,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:05:25,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:05:27,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:05:30,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 16:05:30,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 16:05:30,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 16:05:30,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:05:30,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 16:05:32,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:05:33,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:05:35,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:35,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=69746.66666666667, ans=0.125 2023-09-28 16:05:37,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:37,683 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.46 vs. limit=6.0 2023-09-28 16:05:41,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:05:41,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 16:05:44,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:45,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:05:47,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:05:47,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:05:47,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:05:48,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:05:48,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:05:48,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 16:05:50,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:05:52,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:05:53,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:05:54,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 16:05:55,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:05:55,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=69880.0, ans=0.125 2023-09-28 16:06:02,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:06:02,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-09-28 16:06:04,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=69880.0, ans=0.0 2023-09-28 16:06:06,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:12,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:13,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:14,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=69946.66666666667, ans=0.125 2023-09-28 16:06:16,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:18,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:20,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 16:06:25,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:06:27,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:06:27,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:06:30,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:06:30,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:32,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 16:06:37,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:39,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:06:42,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:42,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:06:43,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:06:43,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:06:43,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:06:43,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:06:45,227 INFO [train.py:1039] (0/4) Epoch 2, batch 5200, loss[loss=0.3002, simple_loss=0.3662, pruned_loss=0.1171, over 24281.00 frames. ], tot_loss[loss=0.3055, simple_loss=0.3479, pruned_loss=0.1315, over 4717463.77 frames. 
], batch size: 77, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:06:48,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:06:48,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:06:53,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:55,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 16:06:57,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:06:58,451 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.056e+02 2.942e+02 3.378e+02 4.176e+02 6.037e+02, threshold=6.756e+02, percent-clipped=0.0 2023-09-28 16:06:58,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:00,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:02,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:07:02,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:04,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. 
Duration: 0.49 2023-09-28 16:07:05,156 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.91 vs. limit=15.0 2023-09-28 16:07:07,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:07:08,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:12,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 16:07:14,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:07:14,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:07:16,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 16:07:17,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 16:07:20,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 16:07:22,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:22,255 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 16:07:22,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 16:07:23,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:23,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:07:25,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 16:07:25,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:07:25,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=70213.33333333333, ans=0.125 2023-09-28 16:07:27,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:32,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 16:07:32,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 16:07:32,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 16:07:35,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 16:07:37,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:07:42,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 16:07:44,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:07:45,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 16:07:47,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:47,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:07:47,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:47,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:07:52,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:07:53,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:07:55,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:58,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:07:58,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:03,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 16:08:04,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 16:08:04,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=70346.66666666667, ans=0.0 2023-09-28 16:08:06,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:08:06,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:08:08,684 INFO [train.py:1039] (0/4) Epoch 2, batch 5250, loss[loss=0.2718, simple_loss=0.3143, pruned_loss=0.1146, over 24427.00 frames. ], tot_loss[loss=0.3038, simple_loss=0.3469, pruned_loss=0.1304, over 4721352.89 frames. ], batch size: 58, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:08:08,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:08,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:08:10,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:08:12,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:08:16,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:16,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:08:18,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:08:25,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:26,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:08:28,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:08:28,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=70480.0, ans=0.0 2023-09-28 16:08:30,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:08:33,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 16:08:33,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:34,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:40,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=70546.66666666667, ans=0.0 2023-09-28 16:08:42,262 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.40 vs. 
limit=22.5 2023-09-28 16:09:18,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.31 vs. limit=15.0 2023-09-28 16:09:19,458 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=70680.0, ans=15.0 2023-09-28 16:09:20,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=70680.0, ans=0.2 2023-09-28 16:09:22,860 INFO [train.py:1039] (0/4) Epoch 2, batch 5300, loss[loss=0.2717, simple_loss=0.321, pruned_loss=0.1112, over 24361.00 frames. ], tot_loss[loss=0.3027, simple_loss=0.3454, pruned_loss=0.13, over 4719239.43 frames. ], batch size: 56, lr: 3.35e-02, grad_scale: 32.0 2023-09-28 16:09:34,301 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.707e+02 3.072e+02 3.599e+02 7.324e+02, threshold=6.143e+02, percent-clipped=3.0 2023-09-28 16:09:37,799 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-2.pt 2023-09-28 16:09:45,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:09:45,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 16:09:45,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 16:09:45,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:45,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 16:09:45,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:45,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:45,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:45,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:09:46,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:46,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:09:46,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:09:46,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 16:09:46,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 16:09:46,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 16:09:46,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:09:47,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. 
Duration: 0.83 2023-09-28 16:09:47,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 16:09:47,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:48,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:48,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:48,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:48,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:09:48,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:48,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:49,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:49,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:49,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:49,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 16:09:49,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:49,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:09:50,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 16:09:50,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:50,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:50,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 16:09:50,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 16:09:50,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:09:50,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:09:50,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 16:09:51,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 16:09:51,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. 
Duration: 0.563625 2023-09-28 16:09:52,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:09:52,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:52,737 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 16:09:52,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 16:09:52,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:09:53,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:53,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 16:09:53,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 16:09:53,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 16:09:53,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:56,589 INFO [train.py:1039] (0/4) Epoch 3, batch 0, loss[loss=0.2808, simple_loss=0.333, pruned_loss=0.1143, over 24418.00 frames. ], tot_loss[loss=0.2808, simple_loss=0.333, pruned_loss=0.1143, over 24418.00 frames. 
], batch size: 58, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:09:56,590 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 16:10:11,592 INFO [train.py:1071] (0/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3654, pruned_loss=0.2147, over 1125622.00 frames. 2023-09-28 16:10:11,593 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 16:10:14,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 16:10:15,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=70826.66666666667, ans=0.0 2023-09-28 16:10:16,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:10:17,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:10:23,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:23,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:10:23,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:23,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 16:10:26,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 16:10:29,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:10:31,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:35,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:36,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:37,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=70893.33333333333, ans=0.0 2023-09-28 16:10:38,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:10:38,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:39,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 16:10:44,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:51,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:10:51,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:53,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 16:10:57,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 16:10:58,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:10:58,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:02,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:11:05,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:05,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=71026.66666666667, ans=0.125 2023-09-28 16:11:09,086 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.30 vs. limit=10.0 2023-09-28 16:11:09,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 16:11:10,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=71026.66666666667, ans=0.1 2023-09-28 16:11:12,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 16:11:12,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:12,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 16:11:15,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:11:15,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:11:17,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 16:11:19,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:21,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:24,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:11:30,030 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 16:11:31,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:11:32,923 INFO [train.py:1039] (0/4) Epoch 3, batch 50, loss[loss=0.2898, simple_loss=0.3455, pruned_loss=0.1171, over 24633.00 frames. ], tot_loss[loss=0.3064, simple_loss=0.3491, pruned_loss=0.1318, over 1068237.20 frames. ], batch size: 68, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:11:33,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:36,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:11:36,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 16:11:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:11:37,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:11:39,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:39,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:43,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:45,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 16:11:45,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:51,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:11:52,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 16:11:54,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-09-28 16:11:56,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=71226.66666666667, ans=0.125 2023-09-28 16:11:57,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:11:59,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:11:59,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:12:01,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:01,432 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=71226.66666666667, ans=0.5 2023-09-28 16:12:03,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:12:03,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:12:03,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:12:03,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=71226.66666666667, ans=0.1 2023-09-28 16:12:11,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 16:12:12,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:12,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:12:14,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 16:12:15,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:12:17,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:12:17,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 16:12:17,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:19,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 16:12:27,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:12:27,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:27,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:29,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 16:12:31,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:33,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 16:12:33,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 16:12:35,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:36,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:38,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:40,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:40,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 16:12:42,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 16:12:43,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:12:44,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.88 vs. limit=22.5 2023-09-28 16:12:44,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:12:46,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:12:47,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 16:12:47,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 16:12:49,333 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.184e+02 2.852e+02 3.312e+02 4.404e+02 9.515e+02, threshold=6.623e+02, percent-clipped=7.0 2023-09-28 16:12:49,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:49,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:50,172 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.42 vs. limit=15.0 2023-09-28 16:12:51,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:12:51,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:12:54,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:12:55,750 INFO [train.py:1039] (0/4) Epoch 3, batch 100, loss[loss=0.2784, simple_loss=0.35, pruned_loss=0.1034, over 24536.00 frames. ], tot_loss[loss=0.3025, simple_loss=0.3471, pruned_loss=0.1289, over 1892147.66 frames. 
], batch size: 71, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:12:57,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:13:01,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:03,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 16:13:03,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:13:08,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:13:08,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:08,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:13:08,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:13:08,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:10,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 16:13:15,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:13:15,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:13:16,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:16,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:21,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 16:13:22,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:23,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:24,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:13:26,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:13:30,728 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 16:13:30,751 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 16:13:32,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:13:32,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 16:13:34,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=71626.66666666667, ans=0.125 2023-09-28 16:13:36,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:13:38,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:40,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:45,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:47,223 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 16:13:49,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:13:54,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:13:54,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:13:56,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:00,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:03,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 16:14:05,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:14:08,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:10,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:10,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:11,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:14:11,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:11,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 16:14:11,870 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 16:14:11,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:13,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:14:15,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:15,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:14:15,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 16:14:15,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:14:15,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:14:16,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:16,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:18,161 INFO [train.py:1039] (0/4) Epoch 3, batch 150, loss[loss=0.2964, simple_loss=0.3314, pruned_loss=0.1307, over 23807.00 frames. ], tot_loss[loss=0.3027, simple_loss=0.3467, pruned_loss=0.1293, over 2515536.02 frames. ], batch size: 164, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:14:18,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:19,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:14:19,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:14:23,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:27,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 16:14:27,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:14:27,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:32,425 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=71826.66666666667, ans=0.0 2023-09-28 16:14:33,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:33,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:33,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=71893.33333333333, ans=0.125 2023-09-28 16:14:38,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:14:38,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:41,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 16:14:41,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 16:14:41,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 16:14:44,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:14:44,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:14:46,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:14:48,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:48,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:49,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:49,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:50,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=71960.0, ans=0.2 2023-09-28 16:14:52,679 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 16:14:54,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:15:00,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:06,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:15:07,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. 
Duration: 0.99 2023-09-28 16:15:11,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:15:11,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:11,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:13,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:15:15,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:15:15,379 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=72026.66666666667, ans=0.2 2023-09-28 16:15:16,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:15:16,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:18,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 16:15:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:22,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:22,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 16:15:24,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:15:24,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=72093.33333333333, ans=0.125 2023-09-28 16:15:25,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:28,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 16:15:29,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:15:31,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:15:33,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:34,799 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.108e+02 2.675e+02 3.139e+02 3.901e+02 5.670e+02, threshold=6.278e+02, percent-clipped=0.0 2023-09-28 16:15:37,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:15:37,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 16:15:37,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:38,561 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. 
Duration: 0.541 2023-09-28 16:15:42,127 INFO [train.py:1039] (0/4) Epoch 3, batch 200, loss[loss=0.3213, simple_loss=0.3626, pruned_loss=0.14, over 23372.00 frames. ], tot_loss[loss=0.3038, simple_loss=0.3479, pruned_loss=0.1298, over 3011660.29 frames. ], batch size: 93, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:15:42,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:15:46,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:15:46,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:15:49,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 16:15:51,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:51,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:54,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 16:15:54,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:15:56,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:57,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:16:02,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:16:02,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:16:02,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:04,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=72226.66666666667, ans=0.125 2023-09-28 16:16:25,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:16:26,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:16:26,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:16:26,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:16:27,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:16:27,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:16:28,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:16:31,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:16:31,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:31,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:16:31,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=72360.0, ans=0.0 2023-09-28 16:16:33,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 16:16:34,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:16:34,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:37,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:16:46,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:46,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=72426.66666666667, ans=0.125 2023-09-28 16:16:55,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:55,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 16:17:00,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:01,540 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-28 16:17:03,689 INFO [train.py:1039] (0/4) Epoch 3, batch 250, loss[loss=0.2841, simple_loss=0.3146, pruned_loss=0.1268, over 23796.00 frames. ], tot_loss[loss=0.3019, simple_loss=0.3459, pruned_loss=0.129, over 3379537.93 frames. ], batch size: 164, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:17:03,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 16:17:05,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:05,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:17:05,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:06,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:17:07,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 16:17:07,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:17:08,517 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-09-28 16:17:10,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:11,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:17:12,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=72493.33333333333, ans=0.2 2023-09-28 16:17:13,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:13,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:15,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:17:16,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:18,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:17:24,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:17:26,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=72560.0, ans=0.0 2023-09-28 16:17:32,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.86 vs. 
limit=10.0 2023-09-28 16:17:34,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:36,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:36,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:17:38,361 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.83 vs. limit=15.0 2023-09-28 16:17:41,521 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.49 vs. limit=22.5 2023-09-28 16:17:42,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:17:42,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:17:44,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:17:45,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:45,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=72626.66666666667, ans=0.125 2023-09-28 16:17:47,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 16:17:47,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:17:48,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:50,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:17:52,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=72693.33333333333, ans=0.0 2023-09-28 16:17:56,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 16:17:56,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:57,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:17:57,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:17:57,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:17:57,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:00,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:18:01,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 16:18:04,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:05,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:18:06,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:09,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:18:13,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:16,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:18:19,846 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.635e+02 3.105e+02 3.716e+02 7.443e+02, threshold=6.210e+02, percent-clipped=1.0 2023-09-28 16:18:22,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:25,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:18:26,487 INFO [train.py:1039] (0/4) Epoch 3, batch 300, loss[loss=0.293, simple_loss=0.3374, pruned_loss=0.1243, over 23370.00 frames. ], tot_loss[loss=0.2978, simple_loss=0.3422, pruned_loss=0.1267, over 3674628.52 frames. ], batch size: 93, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:18:28,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. 
Duration: 0.91 2023-09-28 16:18:29,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:18:29,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:32,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 16:18:32,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:18:34,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:18:34,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 16:18:37,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:40,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:18:42,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:18:42,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 16:18:43,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:45,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-28 16:18:45,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 16:18:45,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:18:47,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=72893.33333333333, ans=0.2 2023-09-28 16:18:50,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:18:50,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=72893.33333333333, ans=0.125 2023-09-28 16:18:54,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:18:54,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 16:18:59,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 16:19:01,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:03,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:07,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:07,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. 
Duration: 0.66 2023-09-28 16:19:07,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:19:08,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:19:10,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:19:12,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:15,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:19:15,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 16:19:15,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=73026.66666666667, ans=0.1 2023-09-28 16:19:15,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=73026.66666666667, ans=0.0 2023-09-28 16:19:16,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:19:18,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:20,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 16:19:20,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:19:24,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:19:28,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:19:28,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 16:19:33,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:19:36,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:36,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=73093.33333333333, ans=0.5 2023-09-28 16:19:37,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:19:37,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 16:19:37,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:19:38,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. 
Duration: 0.7 2023-09-28 16:19:41,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:42,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:44,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:44,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:44,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:49,009 INFO [train.py:1039] (0/4) Epoch 3, batch 350, loss[loss=0.3175, simple_loss=0.3463, pruned_loss=0.1443, over 23579.00 frames. ], tot_loss[loss=0.2962, simple_loss=0.3406, pruned_loss=0.1259, over 3902413.14 frames. ], batch size: 256, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:19:49,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:19:49,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:19:50,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:58,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:20:01,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:20:01,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:04,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 16:20:06,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:06,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 16:20:10,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:11,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 16:20:13,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:14,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 16:20:16,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:20:16,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=73226.66666666667, ans=0.0 2023-09-28 16:20:18,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:19,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 16:20:21,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:21,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:22,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:20:24,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:20:24,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:31,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:20:31,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:20:32,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:20:34,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:40,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. 
Duration: 0.91 2023-09-28 16:20:40,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:46,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:46,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:20:46,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 16:20:50,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=73360.0, ans=0.0 2023-09-28 16:20:51,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:20:52,696 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 16:20:52,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 16:20:52,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:57,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:57,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. 
Duration: 0.45 2023-09-28 16:20:59,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=73426.66666666667, ans=0.07 2023-09-28 16:21:00,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:02,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:21:03,441 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.765e+02 3.239e+02 3.985e+02 6.243e+02, threshold=6.477e+02, percent-clipped=2.0 2023-09-28 16:21:03,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:05,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:05,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:05,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=73426.66666666667, ans=0.1 2023-09-28 16:21:08,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:10,296 INFO [train.py:1039] (0/4) Epoch 3, batch 400, loss[loss=0.3094, simple_loss=0.3514, pruned_loss=0.1337, over 23368.00 frames. ], tot_loss[loss=0.2946, simple_loss=0.3394, pruned_loss=0.1249, over 4087695.09 frames. ], batch size: 105, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:21:10,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 16:21:13,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:21:15,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 16:21:15,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:15,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:17,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:21:18,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:21,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:21,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=73493.33333333333, ans=0.0 2023-09-28 16:21:22,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:23,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 16:21:25,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 16:21:25,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:21:26,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 16:21:26,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:21:30,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:30,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 16:21:30,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:21:30,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:31,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:33,325 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 16:21:34,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 16:21:39,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:21:41,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:42,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 16:21:42,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 16:21:46,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:21:49,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:21:56,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=73626.66666666667, ans=0.0 2023-09-28 16:21:57,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 16:22:00,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:22:03,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 16:22:06,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:22:07,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:22:07,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. 
Duration: 0.83 2023-09-28 16:22:11,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:22:11,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=73693.33333333333, ans=0.0 2023-09-28 16:22:14,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:22:16,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:22:19,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:19,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 16:22:19,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:22:20,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 16:22:21,855 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.73 vs. limit=15.0 2023-09-28 16:22:22,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:22:22,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:22:24,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. 
Duration: 0.89 2023-09-28 16:22:26,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:22:27,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:22:28,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:22:30,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 16:22:30,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:22:32,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:22:32,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:22:32,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 16:22:32,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:22:33,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:22:35,082 INFO [train.py:1039] (0/4) Epoch 3, batch 450, loss[loss=0.2871, simple_loss=0.3273, pruned_loss=0.1235, over 23714.00 frames. ], tot_loss[loss=0.295, simple_loss=0.34, pruned_loss=0.125, over 4222215.57 frames. 
], batch size: 232, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:22:36,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:22:41,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=73826.66666666667, ans=0.125 2023-09-28 16:22:46,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:46,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:22:48,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 16:22:49,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 16:22:52,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:22:56,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:59,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:23:05,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=73893.33333333333, ans=0.125 2023-09-28 16:23:06,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:07,520 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.26 vs. limit=15.0 2023-09-28 16:23:09,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 16:23:09,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 16:23:11,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 16:23:11,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:13,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:14,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:23:16,026 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 16:23:16,040 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 16:23:16,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 16:23:16,767 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.05 vs. limit=15.0 2023-09-28 16:23:17,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:23:19,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:23:21,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=73960.0, ans=0.1 2023-09-28 16:23:22,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:23:22,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:23:24,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:23:24,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 16:23:26,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:29,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:23:29,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:23:32,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. 
Duration: 0.83 2023-09-28 16:23:36,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:23:38,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 16:23:40,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 16:23:41,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:45,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=74093.33333333333, ans=0.125 2023-09-28 16:23:46,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:23:47,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:23:49,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:23:51,050 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 16:23:52,495 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.112e+02 2.606e+02 2.993e+02 3.540e+02 4.868e+02, threshold=5.986e+02, percent-clipped=0.0 2023-09-28 16:23:54,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:54,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 16:23:54,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:54,780 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 16:23:57,556 INFO [train.py:1039] (0/4) Epoch 3, batch 500, loss[loss=0.3075, simple_loss=0.3443, pruned_loss=0.1353, over 23833.00 frames. ], tot_loss[loss=0.2967, simple_loss=0.3412, pruned_loss=0.1261, over 4337738.78 frames. ], batch size: 212, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:23:57,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 16:23:57,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:00,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:24:05,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:24:07,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:24:10,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:24:11,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:24:11,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:24:19,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=74226.66666666667, ans=0.1 2023-09-28 16:24:21,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:22,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:24:22,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:24:22,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:24,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 16:24:24,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:24:27,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:24:28,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:24:28,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:24:30,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:30,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-09-28 16:24:35,583 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 16:24:37,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:24:38,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:38,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:24:41,251 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.81 vs. limit=22.5 2023-09-28 16:24:44,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 16:24:47,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:24:47,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:24:50,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=74360.0, ans=0.1 2023-09-28 16:24:53,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 16:24:54,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:59,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:04,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 16:25:04,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:04,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:10,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 16:25:10,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:25:12,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:16,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 16:25:18,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 16:25:18,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:19,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-09-28 16:25:20,335 INFO [train.py:1039] (0/4) Epoch 3, batch 550, loss[loss=0.3139, simple_loss=0.36, pruned_loss=0.1339, over 23379.00 frames. ], tot_loss[loss=0.2996, simple_loss=0.3438, pruned_loss=0.1278, over 4415206.78 frames. ], batch size: 93, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:25:20,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:25:20,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:21,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:25:25,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:25:28,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:30,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 16:25:30,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:25:34,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 16:25:36,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:39,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:25:40,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:44,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 16:25:44,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 16:25:47,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:25:52,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:25:52,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:25:55,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:25:56,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=74626.66666666667, ans=0.07 2023-09-28 16:25:58,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:58,258 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. 
Duration: 0.958 2023-09-28 16:26:00,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:26:02,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:26:02,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=74626.66666666667, ans=0.125 2023-09-28 16:26:05,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:26:06,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:26:06,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:26:08,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:09,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 16:26:11,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 16:26:12,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:12,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:26:14,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 16:26:14,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:26:17,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:26:18,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:26:22,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:26:22,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:23,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=74693.33333333333, ans=0.0 2023-09-28 16:26:24,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 16:26:24,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:26:27,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:28,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:26:30,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:32,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-09-28 16:26:32,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:26:36,869 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.890e+02 2.622e+02 3.187e+02 4.101e+02 6.995e+02, threshold=6.373e+02, percent-clipped=4.0 2023-09-28 16:26:38,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 16:26:41,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 16:26:42,518 INFO [train.py:1039] (0/4) Epoch 3, batch 600, loss[loss=0.3136, simple_loss=0.356, pruned_loss=0.1356, over 23864.00 frames. ], tot_loss[loss=0.3003, simple_loss=0.3442, pruned_loss=0.1282, over 4474522.85 frames. ], batch size: 164, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:26:42,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:26:44,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:26:44,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:50,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:26:51,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:26:53,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-09-28 16:26:56,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:26:58,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:26:58,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:00,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=74893.33333333333, ans=0.125 2023-09-28 16:27:02,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 16:27:02,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:27:11,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 16:27:12,146 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.07 vs. limit=22.5 2023-09-28 16:27:14,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:27:14,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:14,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 16:27:19,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:27:21,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:27:21,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:27,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:27:33,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:33,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:27:33,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:42,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 16:27:48,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:27:48,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:27:53,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 16:27:53,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 16:27:56,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 16:27:56,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:27:58,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:28:03,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 16:28:04,673 INFO [train.py:1039] (0/4) Epoch 3, batch 650, loss[loss=0.2855, simple_loss=0.3475, pruned_loss=0.1117, over 24359.00 frames. ], tot_loss[loss=0.297, simple_loss=0.3418, pruned_loss=0.1261, over 4526939.15 frames. ], batch size: 77, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:28:04,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:28:06,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:09,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:28:11,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:14,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 16:28:15,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:28:20,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:28:20,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:21,253 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=75226.66666666667, ans=0.2 2023-09-28 16:28:24,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:27,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 16:28:29,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=75226.66666666667, ans=0.05 2023-09-28 16:28:30,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:28:32,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:35,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:28:36,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:28:38,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:39,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:28:39,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:28:41,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:43,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:28:43,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=75293.33333333333, ans=0.0 2023-09-28 16:28:45,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:28:45,340 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 16:28:45,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:45,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:28:45,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=75293.33333333333, ans=0.0 2023-09-28 16:28:47,133 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=75293.33333333333, ans=0.0 2023-09-28 16:28:48,538 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=75293.33333333333, ans=0.0 2023-09-28 16:28:50,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:28:51,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:53,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:28:53,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:28:55,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 16:28:56,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:28:56,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:58,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:28:58,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:59,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:29:01,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 16:29:01,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=75360.0, ans=0.125 2023-09-28 16:29:03,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. 
Duration: 0.72 2023-09-28 16:29:03,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:03,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:29:03,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:29:04,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:29:04,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:29:11,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:11,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:12,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:29:15,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:15,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:29:16,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 16:29:22,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=75426.66666666667, ans=0.125 2023-09-28 16:29:23,201 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.199e+02 2.685e+02 3.190e+02 3.569e+02 4.758e+02, threshold=6.380e+02, percent-clipped=0.0 2023-09-28 16:29:23,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:29:23,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:24,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:24,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:27,705 INFO [train.py:1039] (0/4) Epoch 3, batch 700, loss[loss=0.3048, simple_loss=0.3432, pruned_loss=0.1332, over 23652.00 frames. ], tot_loss[loss=0.2955, simple_loss=0.3398, pruned_loss=0.1256, over 4559452.61 frames. ], batch size: 149, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:29:30,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 16:29:31,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 16:29:34,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 16:29:34,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:29:36,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:29:39,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 16:29:40,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=26.70 vs. limit=22.5 2023-09-28 16:29:44,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:46,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:29:46,376 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=75560.0, ans=0.0 2023-09-28 16:29:47,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:47,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=75560.0, ans=0.2 2023-09-28 16:29:49,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:29:50,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:54,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:56,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-28 16:29:57,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:29:59,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 16:30:03,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 16:30:06,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:30:06,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:30:08,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:30:11,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:30:11,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 16:30:15,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=75626.66666666667, ans=0.1 2023-09-28 16:30:16,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:17,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:30:17,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-09-28 16:30:22,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:30:23,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:27,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:30:32,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:30:34,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 16:30:36,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 16:30:36,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 16:30:38,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=75760.0, ans=0.1 2023-09-28 16:30:40,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:42,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:30:44,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:30:44,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 16:30:44,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 16:30:50,484 INFO [train.py:1039] (0/4) Epoch 3, batch 750, loss[loss=0.2937, simple_loss=0.3301, pruned_loss=0.1287, over 23654.00 frames. ], tot_loss[loss=0.2936, simple_loss=0.3391, pruned_loss=0.124, over 4604021.36 frames. ], batch size: 135, lr: 3.11e-02, grad_scale: 16.0 2023-09-28 16:30:50,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 16:30:50,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 16:30:50,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 16:30:52,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 16:30:52,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 16:30:53,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:30:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 16:30:56,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:58,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 16:31:00,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:01,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:01,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:31:02,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:05,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:31:06,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:31:06,989 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:31:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:31:11,127 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=75893.33333333333, ans=0.125 2023-09-28 16:31:12,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:12,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:14,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-09-28 16:31:15,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:31:15,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:19,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:20,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:31:20,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=75893.33333333333, ans=0.035 2023-09-28 16:31:22,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 16:31:22,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:31:23,972 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=75960.0, ans=0.0 2023-09-28 16:31:26,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 16:31:26,636 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 16:31:28,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 16:31:28,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 16:31:28,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:31:29,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=75960.0, ans=0.125 2023-09-28 16:31:31,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:31:36,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=75960.0, ans=0.0 2023-09-28 16:31:38,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:31:38,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:38,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:31:41,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:42,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:44,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 16:31:44,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:31:47,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. 
Duration: 0.488875 2023-09-28 16:31:49,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:31:52,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:31:53,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 16:31:53,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:55,590 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.40 vs. limit=15.0 2023-09-28 16:31:56,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:31:57,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=76093.33333333333, ans=0.0 2023-09-28 16:31:58,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:31:59,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:01,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:32:03,914 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=24.17 vs. 
limit=15.0 2023-09-28 16:32:04,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 16:32:04,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:04,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=76093.33333333333, ans=0.125 2023-09-28 16:32:06,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,083 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.20 vs. limit=22.5 2023-09-28 16:32:07,687 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.054e+02 2.659e+02 2.971e+02 3.538e+02 5.180e+02, threshold=5.942e+02, percent-clipped=0.0 2023-09-28 16:32:07,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:10,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:10,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:32:12,828 INFO [train.py:1039] (0/4) Epoch 3, batch 800, loss[loss=0.3181, simple_loss=0.3626, pruned_loss=0.1368, over 23446.00 frames. ], tot_loss[loss=0.294, simple_loss=0.3395, pruned_loss=0.1243, over 4619552.30 frames. 
], batch size: 93, lr: 3.11e-02, grad_scale: 32.0 2023-09-28 16:32:23,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:23,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:25,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:25,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:26,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:26,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:28,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:31,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:33,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:32:36,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 16:32:37,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 16:32:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:39,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:32:39,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:39,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 16:32:41,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:41,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 16:32:45,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:47,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=76293.33333333333, ans=0.1 2023-09-28 16:32:48,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:50,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:50,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:32:55,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:56,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:00,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:01,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:33:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 16:33:02,915 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 16:33:02,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 16:33:04,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:33:04,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:06,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:06,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:12,684 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. 
Duration: 0.896 2023-09-28 16:33:12,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 16:33:14,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:33:15,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:33:18,147 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.01 vs. limit=15.0 2023-09-28 16:33:20,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:33:23,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:25,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 16:33:25,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:33:30,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 16:33:33,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:35,166 INFO [train.py:1039] (0/4) Epoch 3, batch 850, loss[loss=0.3203, simple_loss=0.3749, pruned_loss=0.1328, over 24579.00 frames. ], tot_loss[loss=0.294, simple_loss=0.3398, pruned_loss=0.1241, over 4650263.57 frames. 
], batch size: 71, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:33:37,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:33:38,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 16:33:38,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:33:40,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:42,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 16:33:43,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:43,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:33:45,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:46,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:33:48,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:50,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 16:33:50,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. 
Duration: 0.96 2023-09-28 16:33:50,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 16:33:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:53,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:54,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:54,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:54,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:33:58,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:58,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:00,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 16:34:04,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 16:34:05,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:34:09,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-09-28 16:34:12,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 16:34:13,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 16:34:15,967 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 16:34:15,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:15,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:34:16,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:34:18,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 16:34:23,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:25,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:25,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 16:34:25,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:34:28,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:34:29,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:34:29,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 16:34:35,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:34:35,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:35,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:34:35,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:34:37,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:37,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=76693.33333333333, ans=0.0 2023-09-28 16:34:40,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:34:42,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=76760.0, ans=0.1 2023-09-28 16:34:43,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:34:44,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:34:46,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:34:47,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:34:51,729 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.00 vs. limit=15.0 2023-09-28 16:34:53,428 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.514e+02 2.970e+02 3.562e+02 5.095e+02, threshold=5.941e+02, percent-clipped=0.0 2023-09-28 16:34:55,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:34:55,623 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.24 vs. limit=15.0 2023-09-28 16:34:57,903 INFO [train.py:1039] (0/4) Epoch 3, batch 900, loss[loss=0.3063, simple_loss=0.3398, pruned_loss=0.1364, over 23806.00 frames. ], tot_loss[loss=0.2947, simple_loss=0.34, pruned_loss=0.1247, over 4669621.55 frames. ], batch size: 195, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:34:57,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 16:34:58,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 16:34:58,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:34:59,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:35:01,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 16:35:07,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:35:10,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:12,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 16:35:14,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=76893.33333333333, ans=0.0 2023-09-28 16:35:17,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:35:17,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 16:35:18,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:35:19,224 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.32 vs. 
limit=15.0 2023-09-28 16:35:19,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:35:19,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:19,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:35:19,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:35:20,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=76893.33333333333, ans=0.0 2023-09-28 16:35:31,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:35:31,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:31,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:35:34,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:39,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 16:35:41,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:35:46,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. 
Duration: 0.536375 2023-09-28 16:35:46,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:35:48,189 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 16:35:48,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 16:35:53,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=77026.66666666667, ans=0.2 2023-09-28 16:35:55,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:35:55,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:35:55,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:36:01,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:01,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:02,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=77026.66666666667, ans=0.125 2023-09-28 16:36:03,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 16:36:03,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 16:36:07,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 16:36:09,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:36:09,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:11,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:36:11,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:16,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 16:36:16,455 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 16:36:19,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:36:19,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 16:36:19,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=77093.33333333333, ans=0.09899494936611666 2023-09-28 16:36:22,429 INFO [train.py:1039] (0/4) Epoch 3, batch 950, loss[loss=0.3051, simple_loss=0.3315, pruned_loss=0.1393, over 23421.00 frames. ], tot_loss[loss=0.295, simple_loss=0.3407, pruned_loss=0.1247, over 4679018.25 frames. 
], batch size: 285, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:36:22,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:27,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 16:36:31,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:32,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:33,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:34,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:36:36,653 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 16:36:39,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:40,039 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=77226.66666666667, ans=0.125 2023-09-28 16:36:41,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:36:42,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:42,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 16:36:42,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 16:36:44,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:36:47,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:47,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 16:36:49,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:50,035 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=12.0 2023-09-28 16:36:52,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:52,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:54,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-09-28 16:36:54,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=77293.33333333333, ans=0.125 2023-09-28 16:36:56,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=77293.33333333333, ans=0.125 2023-09-28 16:36:57,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:36:59,113 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=77293.33333333333, ans=0.1 2023-09-28 16:37:00,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:37:03,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:37:05,832 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:37:07,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:07,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:37:10,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 16:37:15,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:37:15,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 16:37:15,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:15,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:15,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:37:17,513 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.95 vs. limit=22.5 2023-09-28 16:37:21,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 16:37:21,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:37:25,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:26,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:26,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 16:37:26,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:26,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:37:26,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-09-28 16:37:33,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:37:35,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:40,046 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.741e+02 3.253e+02 3.972e+02 7.741e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 16:37:40,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:37:43,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 16:37:43,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 16:37:45,292 INFO [train.py:1039] (0/4) Epoch 3, batch 1000, loss[loss=0.2826, simple_loss=0.3376, pruned_loss=0.1138, over 24676.00 frames. ], tot_loss[loss=0.2943, simple_loss=0.3397, pruned_loss=0.1245, over 4685037.00 frames. ], batch size: 65, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:37:47,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:50,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 16:37:50,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:53,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 16:37:56,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 16:37:56,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 16:38:01,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:02,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:38:02,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:03,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=77560.0, ans=0.1 2023-09-28 16:38:07,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 16:38:12,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 16:38:13,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 16:38:13,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:16,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 16:38:18,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. 
Duration: 0.5 2023-09-28 16:38:19,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 16:38:20,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:20,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:25,888 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=15.0 2023-09-28 16:38:25,999 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=5.09 vs. limit=15.0 2023-09-28 16:38:28,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:29,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:38:29,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:29,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:29,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 16:38:30,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:31,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 16:38:33,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:33,641 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 16:38:35,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=77693.33333333333, ans=0.125 2023-09-28 16:38:36,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 16:38:38,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 16:38:39,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 16:38:43,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:38:50,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:50,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:38:50,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:51,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:38:53,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-09-28 16:38:54,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:38:54,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 16:38:55,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 16:38:57,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:38:57,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:39:00,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:39:02,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:39:04,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:08,081 INFO [train.py:1039] (0/4) Epoch 3, batch 1050, loss[loss=0.2487, simple_loss=0.306, pruned_loss=0.09571, over 24609.00 frames. ], tot_loss[loss=0.2914, simple_loss=0.3372, pruned_loss=0.1228, over 4690370.06 frames. ], batch size: 60, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:39:09,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:39:09,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 16:39:11,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:39:13,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:14,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:16,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:39:18,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=77826.66666666667, ans=0.125 2023-09-28 16:39:19,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:39:19,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=77826.66666666667, ans=0.125 2023-09-28 16:39:22,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:39:22,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:39:22,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:39:24,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 16:39:25,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 16:39:26,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:26,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 16:39:28,286 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.89 vs. limit=15.0 2023-09-28 16:39:29,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:39:29,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 16:39:29,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:39:35,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:36,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:39:36,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:40,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 16:39:40,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. 
Duration: 0.6 2023-09-28 16:39:40,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:45,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 16:39:47,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 16:39:48,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:52,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:39:53,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:39:53,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:39:55,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:39:58,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:40:03,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 16:40:03,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 16:40:03,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-28 16:40:03,797 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=78026.66666666667, ans=0.125 2023-09-28 16:40:05,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:05,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:40:08,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 16:40:14,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:40:15,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:15,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:15,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:15,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 16:40:20,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 16:40:20,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 16:40:22,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 16:40:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:40:25,628 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.910e+02 2.729e+02 3.108e+02 3.500e+02 5.269e+02, threshold=6.215e+02, percent-clipped=0.0 2023-09-28 16:40:25,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:40:27,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=78093.33333333333, ans=0.1 2023-09-28 16:40:27,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=78093.33333333333, ans=0.125 2023-09-28 16:40:31,240 INFO [train.py:1039] (0/4) Epoch 3, batch 1100, loss[loss=0.2898, simple_loss=0.3339, pruned_loss=0.1229, over 16194.00 frames. ], tot_loss[loss=0.2905, simple_loss=0.3361, pruned_loss=0.1225, over 4678952.86 frames. ], batch size: 35, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:40:31,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:40:37,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:40:38,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 16:40:38,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:40,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 16:40:40,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:40:45,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:40:49,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:40:53,138 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=7.09 vs. limit=12.0 2023-09-28 16:40:53,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:40:53,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 16:40:55,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:40:57,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:57,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:59,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 16:41:01,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:41:06,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:41:09,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 16:41:09,266 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 16:41:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:12,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:13,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:41:13,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:41:14,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=78293.33333333333, ans=0.125 2023-09-28 16:41:16,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 16:41:16,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:41:16,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 16:41:16,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:41:17,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:19,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 16:41:23,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:41:23,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 16:41:27,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:41:32,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:41:35,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 16:41:36,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:41:38,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:39,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=78426.66666666667, ans=0.2 2023-09-28 16:41:40,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:41:40,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:43,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 16:41:43,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:41:43,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:45,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 16:41:45,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:41:45,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 16:41:47,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:41:47,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:41:49,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:41:53,574 INFO [train.py:1039] (0/4) Epoch 3, batch 1150, loss[loss=0.315, simple_loss=0.3614, pruned_loss=0.1343, over 24563.00 frames. ], tot_loss[loss=0.2921, simple_loss=0.3376, pruned_loss=0.1234, over 4687816.43 frames. 
], batch size: 71, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:41:55,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:41:55,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=78493.33333333333, ans=0.125 2023-09-28 16:41:58,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:42:00,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:00,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:42:01,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 16:42:01,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:04,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 16:42:04,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:04,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-28 16:42:05,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=78493.33333333333, ans=0.1 2023-09-28 16:42:06,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=78493.33333333333, ans=0.125 2023-09-28 16:42:13,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 16:42:16,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:19,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:21,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:21,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 16:42:21,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:42:21,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:24,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 16:42:26,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:28,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:42:37,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=78626.66666666667, ans=0.125 2023-09-28 16:42:38,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:43,462 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:42:46,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:46,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 16:42:48,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:50,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=78693.33333333333, ans=0.125 2023-09-28 16:42:54,757 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 16:42:56,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:04,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 16:43:07,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:43:09,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:43:09,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:43:11,193 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.090e+02 2.632e+02 2.933e+02 3.650e+02 8.073e+02, threshold=5.867e+02, percent-clipped=1.0 2023-09-28 16:43:11,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:43:14,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:14,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=78826.66666666667, ans=0.1 2023-09-28 16:43:15,737 INFO [train.py:1039] (0/4) Epoch 3, batch 1200, loss[loss=0.2985, simple_loss=0.359, pruned_loss=0.119, over 24648.00 frames. ], tot_loss[loss=0.2923, simple_loss=0.3378, pruned_loss=0.1233, over 4692007.62 frames. ], batch size: 68, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:43:21,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:43:21,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:43:22,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:22,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:43:22,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:43:25,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:43:27,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:43:29,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:29,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:32,586 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 16:43:35,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 16:43:39,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:43:42,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:43:45,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:45,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:43:45,423 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. 
Duration: 0.896 2023-09-28 16:43:46,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:48,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=78960.0, ans=0.0 2023-09-28 16:43:54,610 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:43:55,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:43:55,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:43:55,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 16:43:57,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:44:02,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 16:44:05,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 16:44:05,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:44:07,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:44:08,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:44:08,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:44:11,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:44:11,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:44:12,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:44:13,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 16:44:13,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:44:13,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:13,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:44:18,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:18,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:23,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:44:25,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 16:44:26,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 16:44:30,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 16:44:32,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:44:35,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:36,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:44:38,699 INFO [train.py:1039] (0/4) Epoch 3, batch 1250, loss[loss=0.3357, simple_loss=0.3608, pruned_loss=0.1553, over 23851.00 frames. ], tot_loss[loss=0.2929, simple_loss=0.3389, pruned_loss=0.1234, over 4710588.10 frames. ], batch size: 179, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:44:38,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:40,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 16:44:45,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:44:47,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:47,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. 
Duration: 0.47 2023-09-28 16:44:50,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:44:50,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:44:55,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:44:55,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:57,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:44:57,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:01,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.39 vs. limit=22.5 2023-09-28 16:45:02,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:45:07,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:45:07,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:45:07,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:08,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:45:08,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:09,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=79226.66666666667, ans=0.2 2023-09-28 16:45:12,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:13,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:45:16,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=79293.33333333333, ans=0.0 2023-09-28 16:45:20,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 16:45:20,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:45:25,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:26,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 16:45:26,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:45:26,801 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 16:45:26,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:45:26,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:30,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:30,683 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.69 vs. limit=15.0 2023-09-28 16:45:35,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:35,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:45:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 16:45:37,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 16:45:37,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 16:45:41,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:45:43,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 16:45:43,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:47,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-09-28 16:45:47,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:45:50,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 16:45:51,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:45:52,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:45:52,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:45:53,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:55,646 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.589e+02 2.905e+02 3.561e+02 6.488e+02, threshold=5.810e+02, percent-clipped=2.0 2023-09-28 16:45:55,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 16:45:58,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:58,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:45:59,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 16:46:01,402 INFO [train.py:1039] (0/4) Epoch 3, batch 1300, loss[loss=0.2816, simple_loss=0.323, pruned_loss=0.1202, over 23766.00 frames. ], tot_loss[loss=0.2922, simple_loss=0.3394, pruned_loss=0.1226, over 4723809.35 frames. ], batch size: 164, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:46:02,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:46:03,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=79493.33333333333, ans=0.125 2023-09-28 16:46:06,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:46:06,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 16:46:12,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:14,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:46:14,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:17,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:46:18,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:46:18,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. 
Duration: 0.81 2023-09-28 16:46:24,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:46:24,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:46:25,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 16:46:25,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=79560.0, ans=0.125 2023-09-28 16:46:30,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:46:31,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=79560.0, ans=0.125 2023-09-28 16:46:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:34,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:35,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:35,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:37,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:46:37,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. 
Duration: 0.52225 2023-09-28 16:46:38,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 16:46:45,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:46:45,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:46:47,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 16:46:47,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:46:50,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:46:53,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:53,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 16:46:54,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:54,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 16:46:56,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:59,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:47:00,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:47:00,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=79693.33333333333, ans=0.0 2023-09-28 16:47:03,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 16:47:04,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 16:47:04,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 16:47:09,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:47:11,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 16:47:15,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:22,593 INFO [train.py:1039] (0/4) Epoch 3, batch 1350, loss[loss=0.3226, simple_loss=0.3657, pruned_loss=0.1397, over 23662.00 frames. ], tot_loss[loss=0.292, simple_loss=0.3386, pruned_loss=0.1227, over 4716765.01 frames. ], batch size: 135, lr: 3.05e-02, grad_scale: 32.0 2023-09-28 16:47:22,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 16:47:28,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:47:28,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:32,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:32,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:36,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:47:36,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:40,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:40,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 16:47:44,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:47:45,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:47:49,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 16:47:50,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:47:51,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 16:47:51,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 16:47:52,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 16:47:55,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 16:47:57,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:57,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 16:48:02,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=79960.0, ans=0.0 2023-09-28 16:48:03,609 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-12000.pt 2023-09-28 16:48:11,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:20,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:22,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:22,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 16:48:26,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 16:48:26,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=80026.66666666667, ans=0.125 2023-09-28 16:48:27,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 16:48:27,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:48:27,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=80026.66666666667, ans=0.2 2023-09-28 16:48:28,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:48:30,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:48:33,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 16:48:35,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:48:35,423 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=80093.33333333333, ans=0.09899494936611666 2023-09-28 16:48:37,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=80093.33333333333, ans=0.0 2023-09-28 16:48:41,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 16:48:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. 
Duration: 0.91 2023-09-28 16:48:44,348 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.017e+02 2.667e+02 3.027e+02 3.668e+02 6.120e+02, threshold=6.055e+02, percent-clipped=2.0 2023-09-28 16:48:48,486 INFO [train.py:1039] (0/4) Epoch 3, batch 1400, loss[loss=0.2471, simple_loss=0.3081, pruned_loss=0.09304, over 24265.00 frames. ], tot_loss[loss=0.2895, simple_loss=0.3372, pruned_loss=0.1209, over 4736031.20 frames. ], batch size: 61, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:48:48,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 16:48:50,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:52,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=80160.0, ans=0.1 2023-09-28 16:48:54,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=80160.0, ans=0.0 2023-09-28 16:48:55,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:48:55,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:49:00,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 16:49:02,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 16:49:02,715 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.46 vs. 
limit=22.5 2023-09-28 16:49:05,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=80226.66666666667, ans=0.1 2023-09-28 16:49:11,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:49:12,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:14,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:49:14,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:49:17,303 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.94 vs. limit=15.0 2023-09-28 16:49:18,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:49:20,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:49:30,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:30,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:31,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=80293.33333333333, ans=0.125 2023-09-28 16:49:34,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. 
Duration: 0.8 2023-09-28 16:49:34,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:49:34,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:49:36,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:49:37,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:39,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:49:39,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:49:39,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:49:40,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 16:49:40,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:49:45,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:52,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:50:00,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. 
Duration: 0.58 2023-09-28 16:50:02,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:50:03,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:50:06,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:50:07,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:07,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:50:10,900 INFO [train.py:1039] (0/4) Epoch 3, batch 1450, loss[loss=0.2825, simple_loss=0.3413, pruned_loss=0.1119, over 23972.00 frames. ], tot_loss[loss=0.2877, simple_loss=0.3356, pruned_loss=0.1199, over 4727171.06 frames. ], batch size: 86, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:50:12,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:50:12,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:50:12,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:14,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:50:18,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 16:50:19,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:50:20,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:50:20,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 16:50:22,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:50:23,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 16:50:23,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:24,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=80493.33333333333, ans=0.0 2023-09-28 16:50:26,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:26,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 16:50:27,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:27,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:50:28,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-28 16:50:28,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:29,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:50:30,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:33,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:38,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:50:38,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:50:40,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:40,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:42,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=80626.66666666667, ans=0.125 2023-09-28 16:50:43,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:43,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 16:50:43,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:45,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:50:48,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 16:50:51,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:54,247 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 16:50:56,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:50:56,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:50:57,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 16:51:06,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:07,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 16:51:09,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. 
Duration: 0.9 2023-09-28 16:51:11,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:11,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=80693.33333333333, ans=0.07 2023-09-28 16:51:13,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=80693.33333333333, ans=0.125 2023-09-28 16:51:14,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:14,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:51:15,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 16:51:18,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 16:51:18,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 16:51:19,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:21,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-28 16:51:28,718 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 2.628e+02 3.276e+02 3.890e+02 6.376e+02, threshold=6.552e+02, percent-clipped=1.0 2023-09-28 16:51:32,272 INFO [train.py:1039] (0/4) Epoch 3, batch 1500, loss[loss=0.279, simple_loss=0.3364, pruned_loss=0.1108, over 23214.00 frames. ], tot_loss[loss=0.289, simple_loss=0.3363, pruned_loss=0.1208, over 4725438.74 frames. ], batch size: 93, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:51:35,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 16:51:35,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:51:35,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:51:35,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:37,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:39,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:51:39,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 16:51:43,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:51:43,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 16:51:43,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:44,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:45,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=80826.66666666667, ans=0.125 2023-09-28 16:51:46,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:51:46,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 16:51:53,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:51:54,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:51:54,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:58,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 16:52:02,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. 
Duration: 0.76 2023-09-28 16:52:04,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:52:05,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 16:52:07,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:52:12,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:12,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:52:13,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:52:15,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 16:52:15,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:52:15,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:15,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 16:52:16,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:22,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 16:52:22,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 16:52:22,469 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=81026.66666666667, ans=0.0 2023-09-28 16:52:28,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:52:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:52:36,341 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 16:52:37,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:37,832 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 16:52:39,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:52:42,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:52:44,317 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 16:52:44,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:52:47,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. 
Duration: 0.65 2023-09-28 16:52:49,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:52,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:54,273 INFO [train.py:1039] (0/4) Epoch 3, batch 1550, loss[loss=0.3093, simple_loss=0.3445, pruned_loss=0.1371, over 23270.00 frames. ], tot_loss[loss=0.2896, simple_loss=0.337, pruned_loss=0.1211, over 4730577.78 frames. ], batch size: 119, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:52:54,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:54,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 16:52:57,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 16:52:57,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:52:59,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. 
Duration: 0.84 2023-09-28 16:53:00,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 16:53:03,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:04,774 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=81160.0, ans=0.0 2023-09-28 16:53:06,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:06,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:06,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:53:07,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:09,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:12,064 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 16:53:12,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:12,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:53:13,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 16:53:16,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:53:16,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 16:53:18,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:18,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 16:53:20,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 16:53:20,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 16:53:21,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:23,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:24,286 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.89 vs. limit=22.5 2023-09-28 16:53:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:53:25,902 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.40 vs. limit=15.0 2023-09-28 16:53:29,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. 
Duration: 0.68 2023-09-28 16:53:29,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 16:53:38,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:39,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=81293.33333333333, ans=0.09899494936611666 2023-09-28 16:53:42,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:42,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:53:42,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:53:44,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 16:53:45,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=81360.0, ans=0.2 2023-09-28 16:53:48,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:53:50,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:53,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:53:55,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:53:55,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:55,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 16:53:55,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:54:00,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:54:00,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:00,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 16:54:00,574 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 16:54:03,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:08,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 16:54:10,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=81426.66666666667, ans=0.1 2023-09-28 16:54:13,778 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.706e+02 3.074e+02 3.869e+02 6.821e+02, threshold=6.147e+02, percent-clipped=1.0 2023-09-28 16:54:14,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:54:15,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:16,855 INFO [train.py:1039] (0/4) Epoch 3, batch 1600, loss[loss=0.2931, simple_loss=0.3549, pruned_loss=0.1157, over 24401.00 frames. ], tot_loss[loss=0.2923, simple_loss=0.3391, pruned_loss=0.1228, over 4726043.97 frames. ], batch size: 77, lr: 3.03e-02, grad_scale: 32.0 2023-09-28 16:54:16,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 16:54:18,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:54:19,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:19,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:54:20,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:54:21,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:54:23,496 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:54:24,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:24,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. 
Duration: 0.67 2023-09-28 16:54:26,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 16:54:26,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=81493.33333333333, ans=0.125 2023-09-28 16:54:28,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 16:54:31,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:54:32,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 16:54:33,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:54:36,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:54:41,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:54:44,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 16:54:47,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:54:48,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 16:54:49,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 16:54:49,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 16:54:54,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 16:55:03,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:03,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 16:55:03,557 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=81626.66666666667, ans=0.2 2023-09-28 16:55:03,901 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.93 vs. limit=12.0 2023-09-28 16:55:05,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:05,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:05,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:55:08,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 16:55:10,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=81693.33333333333, ans=0.07 2023-09-28 16:55:11,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-28 16:55:12,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:55:13,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:15,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:15,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:55:16,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=81693.33333333333, ans=0.125 2023-09-28 16:55:18,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:55:18,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:55:21,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:55:27,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:29,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:55:30,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 16:55:30,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 16:55:33,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 16:55:37,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:39,161 INFO [train.py:1039] (0/4) Epoch 3, batch 1650, loss[loss=0.3138, simple_loss=0.361, pruned_loss=0.1333, over 24385.00 frames. ], tot_loss[loss=0.2921, simple_loss=0.3386, pruned_loss=0.1228, over 4728893.22 frames. ], batch size: 77, lr: 3.03e-02, grad_scale: 16.0 2023-09-28 16:55:40,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:55:43,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:55:43,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 16:55:43,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 16:55:43,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 16:55:43,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 16:55:46,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:48,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:55:48,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:55:48,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:55:48,793 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=81826.66666666667, ans=0.125 2023-09-28 16:55:51,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:53,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 16:55:55,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:55,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:55,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:55:55,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:55:56,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 16:55:58,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 16:56:02,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 16:56:04,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:56:14,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 16:56:16,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:17,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 16:56:21,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:24,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:56:24,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:56:24,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:26,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:56:26,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:29,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:56:29,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:56:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:31,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:32,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:33,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:56:37,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:39,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 16:56:41,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:42,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 16:56:42,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 16:56:42,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 16:56:42,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 16:56:43,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=82093.33333333333, ans=0.2 2023-09-28 16:56:44,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:56:45,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:48,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:48,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 16:56:51,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:52,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:56:53,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:56,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 16:56:59,512 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.428e+02 2.816e+02 3.293e+02 5.315e+02, threshold=5.632e+02, percent-clipped=0.0 2023-09-28 16:57:01,806 INFO [train.py:1039] (0/4) Epoch 3, batch 1700, loss[loss=0.284, simple_loss=0.3353, pruned_loss=0.1164, over 24573.00 frames. ], tot_loss[loss=0.2913, simple_loss=0.3376, pruned_loss=0.1225, over 4722610.48 frames. 
], batch size: 60, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:57:01,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:57:01,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:57:01,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 16:57:02,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:57:02,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:06,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:57:06,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:57:06,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 16:57:06,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=82160.0, ans=0.05 2023-09-28 16:57:08,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 16:57:13,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=82160.0, ans=0.2 2023-09-28 16:57:16,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:20,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:57:27,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:57:27,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:57:27,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:28,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:57:32,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 16:57:34,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:57:35,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:37,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:57:37,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 16:57:39,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 16:57:40,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 16:57:43,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:45,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 16:57:45,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:57:53,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:57:54,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=82360.0, ans=0.0 2023-09-28 16:57:57,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:57:57,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:58:00,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:58:00,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 16:58:00,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 16:58:01,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:01,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 16:58:03,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:03,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:03,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:03,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:05,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:05,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:58:07,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:07,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=82426.66666666667, ans=0.5 2023-09-28 16:58:09,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 16:58:11,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:14,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.51 vs. limit=15.0 2023-09-28 16:58:15,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:15,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 16:58:18,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:20,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:20,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=82426.66666666667, ans=0.0 2023-09-28 16:58:21,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 16:58:23,947 INFO [train.py:1039] (0/4) Epoch 3, batch 1750, loss[loss=0.3384, simple_loss=0.3647, pruned_loss=0.156, over 23796.00 frames. ], tot_loss[loss=0.2903, simple_loss=0.3369, pruned_loss=0.1219, over 4728401.79 frames. ], batch size: 179, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:58:28,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:32,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 16:58:32,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:58:32,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 16:58:32,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:35,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:58:37,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:42,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 16:58:43,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:46,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 16:58:46,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:48,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:58:51,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 16:58:53,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. 
Duration: 0.43 2023-09-28 16:58:55,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:56,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 16:59:05,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:59:07,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:07,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:10,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:11,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:13,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:59:14,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:17,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:19,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:59:20,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-09-28 16:59:23,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:25,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 16:59:26,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:28,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:28,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:59:32,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:59:33,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:59:35,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:36,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:41,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:43,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 16:59:44,782 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.581e+02 2.939e+02 3.799e+02 7.676e+02, threshold=5.877e+02, percent-clipped=7.0 2023-09-28 16:59:46,364 INFO [train.py:1039] (0/4) Epoch 3, batch 1800, loss[loss=0.2913, simple_loss=0.33, pruned_loss=0.1263, over 23836.00 frames. ], tot_loss[loss=0.2876, simple_loss=0.3341, pruned_loss=0.1205, over 4720662.60 frames. ], batch size: 179, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 16:59:46,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:59:46,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 16:59:46,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:48,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:59:48,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:59:48,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:59:48,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:59:49,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:59:51,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 16:59:53,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:55,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:59:56,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:59,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:00:01,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:05,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:08,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:00:11,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:00:12,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 17:00:12,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:00:16,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:21,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 17:00:23,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 17:00:23,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 17:00:23,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:24,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:24,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:00:25,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=82960.0, ans=0.0 2023-09-28 17:00:26,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:00:31,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=82960.0, ans=0.0 2023-09-28 17:00:32,921 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 17:00:33,887 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.11 vs. 
limit=6.0 2023-09-28 17:00:34,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:00:34,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=83026.66666666667, ans=0.1 2023-09-28 17:00:37,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:37,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 17:00:39,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 17:00:39,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:00:39,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:00:41,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:00:46,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 17:00:51,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:00:52,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 17:00:52,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:00:52,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:52,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:00:54,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 17:00:58,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:00:58,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:58,433 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=83093.33333333333, ans=0.0 2023-09-28 17:00:59,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 17:00:59,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:02,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:02,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:01:02,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:01:02,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:01:06,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:01:06,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:09,494 INFO [train.py:1039] (0/4) Epoch 3, batch 1850, loss[loss=0.2694, simple_loss=0.3174, pruned_loss=0.1108, over 24427.00 frames. ], tot_loss[loss=0.289, simple_loss=0.3351, pruned_loss=0.1214, over 4714637.95 frames. ], batch size: 58, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:01:09,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:01:11,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:16,557 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=83160.0, ans=0.1 2023-09-28 17:01:17,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:01:17,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 17:01:22,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. 
Duration: 0.82 2023-09-28 17:01:24,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=83226.66666666667, ans=0.0 2023-09-28 17:01:25,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 17:01:28,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:01:28,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 17:01:28,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:01:39,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=83226.66666666667, ans=0.0 2023-09-28 17:01:40,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:01:42,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 17:01:45,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:01:45,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:01:46,431 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=15.0 2023-09-28 17:01:49,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-09-28 17:01:49,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:49,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:01:51,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:01:53,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:56,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:00,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:02:00,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:00,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:02:00,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:02,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:04,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:02:08,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-09-28 17:02:08,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:11,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=10.45 vs. limit=15.0 2023-09-28 17:02:12,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:02:14,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:02:14,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 17:02:14,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 17:02:17,689 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 17:02:18,599 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.95 vs. limit=15.0 2023-09-28 17:02:19,206 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 17:02:20,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:02:20,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:02:20,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 17:02:22,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:24,525 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 17:02:24,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:02:24,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:26,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:02:27,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:02:29,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=83426.66666666667, ans=10.0 2023-09-28 17:02:30,305 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.984e+02 2.645e+02 2.967e+02 3.523e+02 5.465e+02, threshold=5.934e+02, percent-clipped=0.0 2023-09-28 17:02:30,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:02:30,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 17:02:31,883 INFO [train.py:1039] (0/4) Epoch 3, batch 1900, loss[loss=0.2897, simple_loss=0.3301, pruned_loss=0.1247, over 23731.00 frames. ], tot_loss[loss=0.2907, simple_loss=0.337, pruned_loss=0.1222, over 4712552.59 frames. 
], batch size: 164, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:02:32,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:32,159 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 17:02:32,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:02:33,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:37,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=83493.33333333333, ans=10.0 2023-09-28 17:02:39,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:40,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=83493.33333333333, ans=0.07 2023-09-28 17:02:42,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:02:43,702 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 17:02:43,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 17:02:47,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:47,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:02:47,475 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 17:02:48,859 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 17:02:51,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 17:02:52,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:02:55,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 17:02:59,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 17:03:10,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 17:03:13,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 17:03:13,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:03:13,255 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 17:03:13,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 17:03:13,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 17:03:14,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. 
Duration: 0.85 2023-09-28 17:03:14,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:03:19,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 17:03:19,567 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=83626.66666666667, ans=0.2 2023-09-28 17:03:22,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:03:23,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=83693.33333333333, ans=0.1 2023-09-28 17:03:26,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:26,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 17:03:26,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:03:30,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=83693.33333333333, ans=0.0 2023-09-28 17:03:31,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 17:03:33,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:40,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 17:03:41,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:03:41,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:03:43,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:03:44,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:03:44,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:03:46,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:03:47,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:03:47,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:03:51,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:03:51,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:52,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:54,191 INFO [train.py:1039] (0/4) Epoch 3, batch 1950, loss[loss=0.409, simple_loss=0.4099, pruned_loss=0.2041, over 19374.00 frames. 
], tot_loss[loss=0.2914, simple_loss=0.3382, pruned_loss=0.1223, over 4722799.14 frames. ], batch size: 389, lr: 3.00e-02, grad_scale: 16.0 2023-09-28 17:03:54,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:04:00,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:02,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:04:02,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:02,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:04:05,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 17:04:05,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:04:07,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:07,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:10,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:04:10,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:04:10,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:14,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:17,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:17,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:04:17,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:04:17,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:18,042 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.20 vs. limit=15.0 2023-09-28 17:04:21,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:24,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:04:24,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:24,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:04:24,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-09-28 17:04:26,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:04:27,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:04:27,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:28,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=83960.0, ans=0.0 2023-09-28 17:04:31,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:35,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:04:37,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=83960.0, ans=0.125 2023-09-28 17:04:40,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:04:42,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=84026.66666666667, ans=0.0 2023-09-28 17:04:44,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:04:44,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:04:44,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. 
Duration: 0.81 2023-09-28 17:04:45,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:04:47,953 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.79 vs. limit=12.0 2023-09-28 17:04:49,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:50,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:04:51,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=84026.66666666667, ans=0.07 2023-09-28 17:04:52,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:04:59,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:00,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:03,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:07,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:08,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 17:05:09,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:09,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 17:05:09,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:05:10,137 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=84093.33333333333, ans=0.0 2023-09-28 17:05:10,180 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=84093.33333333333, ans=0.125 2023-09-28 17:05:11,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:05:12,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 17:05:14,261 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.050e+02 2.608e+02 2.981e+02 3.638e+02 7.272e+02, threshold=5.963e+02, percent-clipped=1.0 2023-09-28 17:05:14,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:16,366 INFO [train.py:1039] (0/4) Epoch 3, batch 2000, loss[loss=0.3039, simple_loss=0.3651, pruned_loss=0.1214, over 24594.00 frames. ], tot_loss[loss=0.2931, simple_loss=0.3396, pruned_loss=0.1234, over 4710223.40 frames. ], batch size: 71, lr: 3.00e-02, grad_scale: 32.0 2023-09-28 17:05:18,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. 
Duration: 0.87775 2023-09-28 17:05:19,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:05:19,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:05:21,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:05:23,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:26,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 17:05:26,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:05:29,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:05:29,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=84160.0, ans=0.125 2023-09-28 17:05:31,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 17:05:31,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:05:31,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:35,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:05:36,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 17:05:38,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:38,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=84226.66666666667, ans=0.5 2023-09-28 17:05:40,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:41,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 17:05:41,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:05:44,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 17:05:44,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:46,773 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:05:48,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:05:49,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 17:05:49,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:51,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:05:51,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:05:53,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 17:05:53,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=84293.33333333333, ans=0.2 2023-09-28 17:05:58,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 17:05:58,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:58,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:04,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:05,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:06:05,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:05,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:06:06,732 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.16 vs. limit=22.5 2023-09-28 17:06:08,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:08,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=84360.0, ans=10.0 2023-09-28 17:06:09,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:09,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:09,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:11,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:14,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:06:15,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 17:06:16,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=84360.0, ans=0.09899494936611666 2023-09-28 17:06:19,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-28 17:06:21,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:25,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:27,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:06:30,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:33,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:33,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:35,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:06:35,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:06:35,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.03 vs. limit=15.0 2023-09-28 17:06:38,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:39,524 INFO [train.py:1039] (0/4) Epoch 3, batch 2050, loss[loss=0.3108, simple_loss=0.3406, pruned_loss=0.1405, over 23733.00 frames. ], tot_loss[loss=0.2932, simple_loss=0.3392, pruned_loss=0.1236, over 4680473.51 frames. 
], batch size: 179, lr: 2.99e-02, grad_scale: 32.0 2023-09-28 17:06:39,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:43,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:43,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:50,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:53,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:06:53,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:54,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:06:56,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 17:06:56,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:06:58,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:58,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. 
Duration: 0.688875 2023-09-28 17:07:10,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:10,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:13,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 17:07:15,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:15,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 17:07:17,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:18,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:20,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=84626.66666666667, ans=0.125 2023-09-28 17:07:21,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:22,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:07:22,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:07:22,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=84626.66666666667, ans=0.125 2023-09-28 17:07:23,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:07:25,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:07:25,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:07:28,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:30,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:07:32,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:07:33,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=84693.33333333333, ans=0.125 2023-09-28 17:07:35,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:07:40,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:07:45,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:07:46,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-09-28 17:07:51,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:07:53,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:07:56,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:07:58,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 17:07:59,266 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=84760.0, ans=0.125 2023-09-28 17:08:01,465 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 17:08:01,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:02,769 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.062e+02 2.817e+02 3.171e+02 3.803e+02 7.947e+02, threshold=6.342e+02, percent-clipped=1.0 2023-09-28 17:08:02,812 INFO [train.py:1039] (0/4) Epoch 3, batch 2100, loss[loss=0.2611, simple_loss=0.3202, pruned_loss=0.101, over 24337.00 frames. ], tot_loss[loss=0.2907, simple_loss=0.337, pruned_loss=0.1222, over 4699696.46 frames. ], batch size: 61, lr: 2.99e-02, grad_scale: 16.0 2023-09-28 17:08:02,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:03,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 17:08:04,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:08:04,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 17:08:04,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 17:08:07,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:08:09,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:08:11,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:08:12,267 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.82 vs. limit=10.0 2023-09-28 17:08:15,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:16,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:08:16,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 17:08:16,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:08:16,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-09-28 17:08:16,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 17:08:19,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:19,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:08:19,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 17:08:20,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:08:25,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=84893.33333333333, ans=0.0 2023-09-28 17:08:28,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 17:08:28,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:31,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:08:31,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:34,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:08:36,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. 
Duration: 0.69 2023-09-28 17:08:36,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:36,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 17:08:39,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 17:08:39,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:39,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 17:08:41,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 17:08:41,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 17:08:41,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=84960.0, ans=10.0 2023-09-28 17:08:42,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:08:44,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:08:48,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:50,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 17:08:53,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:54,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:54,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 17:08:54,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:54,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:55,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:55,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 17:08:57,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 17:08:58,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 17:09:03,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:09:08,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:09:09,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-09-28 17:09:15,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:16,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:09:18,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:09:18,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:09:18,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 17:09:18,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:09:19,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:19,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:09:21,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:09:23,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:23,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 17:09:25,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. 
Duration: 0.67 2023-09-28 17:09:25,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:27,909 INFO [train.py:1039] (0/4) Epoch 3, batch 2150, loss[loss=0.3022, simple_loss=0.344, pruned_loss=0.1302, over 23159.00 frames. ], tot_loss[loss=0.2901, simple_loss=0.3366, pruned_loss=0.1218, over 4708691.79 frames. ], batch size: 105, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:09:31,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:09:31,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:09:31,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:09:32,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:09:37,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:09:40,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:40,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:43,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:09:43,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:09:43,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:09:46,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=85226.66666666667, ans=0.125 2023-09-28 17:09:47,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:47,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:09:47,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:09:49,523 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.28 vs. limit=15.0 2023-09-28 17:09:52,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:52,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 17:09:57,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:09:59,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 17:10:00,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=85293.33333333333, ans=10.0 2023-09-28 17:10:01,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:01,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:10:04,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:04,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:10:04,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=85293.33333333333, ans=0.125 2023-09-28 17:10:05,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:10:06,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 17:10:09,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 17:10:10,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:10,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:11,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:10:13,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:10:14,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=85293.33333333333, ans=0.1 2023-09-28 17:10:16,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:16,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:10:18,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:18,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 17:10:18,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:10:21,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:10:21,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:22,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:23,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:10:24,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:24,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:26,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 17:10:28,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 17:10:28,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:10:29,669 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 17:10:29,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:29,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:10:31,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. 
Duration: 0.64 2023-09-28 17:10:31,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:10:31,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 17:10:31,319 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 17:10:31,319 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 17:10:33,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 17:10:35,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:35,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:35,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:10:37,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:38,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:10:40,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:40,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:10:48,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:10:50,167 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.981e+02 2.450e+02 2.912e+02 3.382e+02 5.716e+02, threshold=5.824e+02, percent-clipped=0.0 2023-09-28 17:10:50,211 INFO [train.py:1039] (0/4) Epoch 3, batch 2200, loss[loss=0.2957, simple_loss=0.3275, pruned_loss=0.132, over 23409.00 frames. ], tot_loss[loss=0.2888, simple_loss=0.3359, pruned_loss=0.1209, over 4709080.11 frames. ], batch size: 285, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:10:50,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 17:10:53,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:10:54,518 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.18 vs. limit=15.0 2023-09-28 17:10:56,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:58,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:11:00,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:01,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 17:11:02,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=85493.33333333333, ans=0.125 2023-09-28 17:11:05,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:11:06,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:11:06,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 17:11:08,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=85560.0, ans=0.125 2023-09-28 17:11:12,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 17:11:14,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:11:20,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 17:11:21,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:23,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:23,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:11:28,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 17:11:28,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 17:11:31,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:11:32,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:34,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:11:38,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:11:38,953 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.52 vs. limit=15.0 2023-09-28 17:11:39,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:41,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:11:42,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:45,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 17:11:46,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:48,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. 
Duration: 0.81 2023-09-28 17:11:50,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:11:52,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:53,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:55,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:55,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:55,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:56,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=85760.0, ans=0.0 2023-09-28 17:11:58,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:11:58,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:11:59,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:12:02,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-28 17:12:02,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:06,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:12:07,697 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 17:12:08,368 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.35 vs. limit=6.0 2023-09-28 17:12:09,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:12:11,183 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 17:12:13,190 INFO [train.py:1039] (0/4) Epoch 3, batch 2250, loss[loss=0.2712, simple_loss=0.3197, pruned_loss=0.1113, over 23889.00 frames. ], tot_loss[loss=0.2891, simple_loss=0.3363, pruned_loss=0.121, over 4714672.85 frames. ], batch size: 195, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:12:13,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:12:13,366 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 17:12:14,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:15,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. 
Duration: 0.42725 2023-09-28 17:12:16,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:18,569 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 17:12:20,706 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.94 vs. limit=22.5 2023-09-28 17:12:21,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:12:23,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:28,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:12:29,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:12:34,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:34,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:34,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:37,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 17:12:37,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:12:37,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:12:39,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=85893.33333333333, ans=0.125 2023-09-28 17:12:40,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 17:12:40,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:12:40,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:43,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:47,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:49,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:12:49,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:12:50,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 17:12:52,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:55,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:13:01,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:04,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:13:04,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=86026.66666666667, ans=0.0 2023-09-28 17:13:04,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=86026.66666666667, ans=0.1 2023-09-28 17:13:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:13:09,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:13:12,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:13:15,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:13:22,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-28 17:13:22,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:13:22,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:13:29,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:13:32,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:13:32,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 17:13:32,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:32,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:13:34,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=86160.0, ans=10.0 2023-09-28 17:13:35,615 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.481e+02 2.992e+02 3.507e+02 5.214e+02, threshold=5.985e+02, percent-clipped=0.0 2023-09-28 17:13:35,659 INFO [train.py:1039] (0/4) Epoch 3, batch 2300, loss[loss=0.3213, simple_loss=0.3499, pruned_loss=0.1463, over 23541.00 frames. ], tot_loss[loss=0.2891, simple_loss=0.3361, pruned_loss=0.121, over 4715813.87 frames. ], batch size: 256, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:13:35,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. 
Duration: 0.78 2023-09-28 17:13:37,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=86160.0, ans=0.04949747468305833 2023-09-28 17:13:38,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:13:39,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:43,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:44,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:13:45,558 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 17:13:47,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:55,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:13:55,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:13:55,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:13:57,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:57,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-09-28 17:13:59,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:14:02,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:02,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:14:02,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=86226.66666666667, ans=0.1 2023-09-28 17:14:02,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=86226.66666666667, ans=0.125 2023-09-28 17:14:07,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:14:10,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:14:13,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:15,857 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.92 vs. limit=10.0 2023-09-28 17:14:17,391 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.20 vs. limit=15.0 2023-09-28 17:14:21,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 17:14:21,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:14:24,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:14:26,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:14:31,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:32,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:14:33,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:14:33,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 17:14:38,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:14:38,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:38,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:38,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:14:38,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:14:40,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:14:40,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:14:40,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 17:14:40,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:14:40,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:43,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 17:14:49,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:14:52,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:14:55,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=86426.66666666667, ans=0.1 2023-09-28 17:14:57,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:57,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:14:57,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. 
Duration: 0.77775 2023-09-28 17:14:57,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=86493.33333333333, ans=0.125 2023-09-28 17:14:58,587 INFO [train.py:1039] (0/4) Epoch 3, batch 2350, loss[loss=0.2913, simple_loss=0.3231, pruned_loss=0.1297, over 23771.00 frames. ], tot_loss[loss=0.2909, simple_loss=0.3372, pruned_loss=0.1223, over 4703516.60 frames. ], batch size: 164, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:14:58,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:14:58,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:00,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:15:00,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 17:15:07,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:07,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 17:15:07,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=86493.33333333333, ans=0.09899494936611666 2023-09-28 17:15:12,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 17:15:16,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:15:19,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:19,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:21,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 17:15:24,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:15:30,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 17:15:33,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:34,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:15:34,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:38,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:15:40,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. 
Duration: 0.82 2023-09-28 17:15:41,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:15:45,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:45,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:15:45,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:15:48,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:15:50,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 17:15:52,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:53,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:54,786 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=6.43 vs. limit=15.0 2023-09-28 17:15:55,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:15:56,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 17:15:56,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 17:15:57,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=86693.33333333333, ans=0.125 2023-09-28 17:16:01,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 17:16:01,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:16:06,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 17:16:07,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 17:16:08,193 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=86760.0, ans=0.2 2023-09-28 17:16:09,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:16:09,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:16:10,618 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 17:16:10,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 17:16:12,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=86760.0, ans=0.125 2023-09-28 17:16:12,675 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=15.09 vs. 
limit=15.0 2023-09-28 17:16:13,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 17:16:16,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:16:20,375 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.689e+02 3.044e+02 3.623e+02 6.836e+02, threshold=6.088e+02, percent-clipped=1.0 2023-09-28 17:16:20,417 INFO [train.py:1039] (0/4) Epoch 3, batch 2400, loss[loss=0.2783, simple_loss=0.3425, pruned_loss=0.107, over 24461.00 frames. ], tot_loss[loss=0.288, simple_loss=0.3354, pruned_loss=0.1203, over 4718478.40 frames. ], batch size: 69, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:16:20,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=86826.66666666667, ans=0.0 2023-09-28 17:16:21,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:16:24,685 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.42 vs. limit=15.0 2023-09-28 17:16:26,457 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.41 vs. limit=22.5 2023-09-28 17:16:27,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 17:16:27,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=86826.66666666667, ans=0.07 2023-09-28 17:16:27,660 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.54 vs. limit=15.0 2023-09-28 17:16:28,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:16:28,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 17:16:30,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 17:16:36,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:16:36,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:16:39,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 17:16:39,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:16:39,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:40,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 17:16:44,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 17:16:47,915 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=86893.33333333333, ans=0.1 2023-09-28 17:16:49,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 17:16:56,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:17:00,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 17:17:05,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:05,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:09,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:09,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 17:17:10,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:17:16,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:19,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:21,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 17:17:23,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:17:23,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:17:23,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:17:23,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:24,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:24,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:17:29,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:17:31,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:17:31,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 17:17:34,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 17:17:35,004 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.83 vs. 
limit=15.0 2023-09-28 17:17:35,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:37,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 17:17:38,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 17:17:38,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 17:17:38,826 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 17:17:38,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 17:17:39,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=87093.33333333333, ans=0.1 2023-09-28 17:17:40,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:40,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:40,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:42,285 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-28 17:17:43,647 INFO [train.py:1039] (0/4) Epoch 3, batch 2450, loss[loss=0.2737, simple_loss=0.3429, pruned_loss=0.1023, over 24596.00 frames. ], tot_loss[loss=0.2869, simple_loss=0.3349, pruned_loss=0.1194, over 4728562.99 frames. ], batch size: 71, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:17:43,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:43,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:17:48,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:17:48,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:53,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:53,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:53,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 17:17:57,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:57,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:03,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 17:18:03,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:18:03,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:18:03,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 17:18:05,862 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.49 vs. limit=15.0 2023-09-28 17:18:09,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:11,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:18:12,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:18:15,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:18:15,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:15,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:17,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 17:18:18,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 17:18:19,162 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:18:19,953 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.10 vs. limit=12.0 2023-09-28 17:18:20,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:18:22,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=87293.33333333333, ans=0.0 2023-09-28 17:18:29,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:29,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:31,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:31,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:18:33,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:35,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:18:36,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. 
Duration: 0.77 2023-09-28 17:18:40,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:40,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:18:42,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:18:43,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:48,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=87360.0, ans=0.125 2023-09-28 17:18:48,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=87360.0, ans=0.0 2023-09-28 17:18:49,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:18:49,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 17:18:51,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:18:52,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:18:52,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 17:18:54,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 17:18:54,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:18:57,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:19:00,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:00,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:19:04,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 17:19:05,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:19:07,580 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.571e+02 3.066e+02 3.811e+02 5.963e+02, threshold=6.132e+02, percent-clipped=0.0 2023-09-28 17:19:07,642 INFO [train.py:1039] (0/4) Epoch 3, batch 2500, loss[loss=0.3028, simple_loss=0.3577, pruned_loss=0.124, over 23666.00 frames. ], tot_loss[loss=0.2855, simple_loss=0.3334, pruned_loss=0.1188, over 4717202.10 frames. ], batch size: 85, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:19:12,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:22,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:19:22,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:19:23,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:23,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 17:19:31,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:19:33,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:19:33,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:19:34,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:19:37,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 17:19:37,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:38,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:38,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 17:19:38,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:38,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-09-28 17:19:40,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:44,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:44,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:48,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:19:49,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 17:19:51,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:19:51,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:56,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:58,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=87693.33333333333, ans=0.1 2023-09-28 17:19:59,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:02,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:08,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 17:20:13,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 17:20:13,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:13,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:13,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.97 vs. limit=15.0 2023-09-28 17:20:14,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:20:14,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:20:15,012 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 17:20:15,013 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 17:20:15,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 17:20:19,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:21,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 17:20:21,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. 
Duration: 0.92 2023-09-28 17:20:23,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:24,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 17:20:27,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 17:20:30,815 INFO [train.py:1039] (0/4) Epoch 3, batch 2550, loss[loss=0.2968, simple_loss=0.3511, pruned_loss=0.1213, over 24025.00 frames. ], tot_loss[loss=0.2865, simple_loss=0.3341, pruned_loss=0.1195, over 4720522.48 frames. ], batch size: 80, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:20:30,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:34,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:20:34,567 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.53 vs. limit=15.0 2023-09-28 17:20:35,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:20:35,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:37,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 17:20:37,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. 
Duration: 0.67275 2023-09-28 17:20:41,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 17:20:43,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:20:45,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:47,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:47,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 17:20:49,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:20:49,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:20:49,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:53,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:20:53,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 17:20:53,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:53,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 17:20:53,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 17:21:06,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=87960.0, ans=0.125 2023-09-28 17:21:08,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:21:09,205 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=87960.0, ans=0.125 2023-09-28 17:21:13,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:13,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:13,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:21:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:21:22,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:21:25,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:21:25,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:21:25,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 17:21:25,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:21:25,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:21:29,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:29,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:30,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=88026.66666666667, ans=0.125 2023-09-28 17:21:34,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:21:36,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 17:21:36,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:21:37,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:37,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:21:39,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:21:40,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:21:46,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=88093.33333333333, ans=0.2 2023-09-28 17:21:49,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:21:51,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:53,263 INFO [train.py:1039] (0/4) Epoch 3, batch 2600, loss[loss=0.2993, simple_loss=0.342, pruned_loss=0.1283, over 23827.00 frames. ], tot_loss[loss=0.2872, simple_loss=0.3352, pruned_loss=0.1196, over 4724806.30 frames. ], batch size: 195, lr: 2.95e-02, grad_scale: 16.0 2023-09-28 17:21:54,710 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.952e+02 2.618e+02 3.140e+02 3.668e+02 6.690e+02, threshold=6.281e+02, percent-clipped=1.0 2023-09-28 17:21:54,939 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 17:21:58,535 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 17:21:58,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:22:00,109 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 17:22:00,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 17:22:00,257 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 17:22:03,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 17:22:03,394 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 17:22:05,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 17:22:07,743 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 17:22:07,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=88160.0, ans=0.1 2023-09-28 17:22:09,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:22:10,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 17:22:12,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 17:22:13,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:22:13,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 17:22:16,832 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 17:22:16,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 17:22:20,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=88226.66666666667, ans=0.0 2023-09-28 17:22:24,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 17:22:24,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:24,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:24,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 17:22:27,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:22:35,218 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 17:22:39,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:42,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:42,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 17:22:42,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:22:42,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:44,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. 
Duration: 0.8 2023-09-28 17:22:44,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=88360.0, ans=0.0 2023-09-28 17:22:46,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:22:46,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:22:46,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=88360.0, ans=0.125 2023-09-28 17:22:47,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:52,155 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 17:22:53,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:53,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:22:53,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=88360.0, ans=0.1 2023-09-28 17:23:00,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:23:00,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:23:00,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-09-28 17:23:01,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:23:03,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:04,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:11,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 17:23:11,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:14,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:23:16,489 INFO [train.py:1039] (0/4) Epoch 3, batch 2650, loss[loss=0.3223, simple_loss=0.3541, pruned_loss=0.1452, over 23798.00 frames. ], tot_loss[loss=0.2873, simple_loss=0.3353, pruned_loss=0.1196, over 4718140.19 frames. ], batch size: 212, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:23:20,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 17:23:21,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:21,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:23:23,388 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. 
Duration: 0.768 2023-09-28 17:23:23,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:23:25,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:28,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:23:29,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:32,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:23:34,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 17:23:34,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:23:34,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:23:37,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 17:23:39,536 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 17:23:42,325 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.21 vs. limit=8.0 2023-09-28 17:23:43,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 17:23:46,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 17:23:46,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:23:46,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 17:23:50,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:50,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:23:51,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:51,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:23:56,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 17:23:58,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 17:23:59,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:02,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 17:24:02,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:24:02,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:02,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:04,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:04,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:06,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:08,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:09,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:24:09,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:24:11,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:24:13,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:14,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:24:14,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:24:16,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:16,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:24:20,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:22,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:24:24,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:24,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 17:24:27,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:29,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:32,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:34,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:34,458 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:24:35,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 17:24:35,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:37,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:24:37,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 17:24:38,913 INFO [train.py:1039] (0/4) Epoch 3, batch 2700, loss[loss=0.2719, simple_loss=0.3255, pruned_loss=0.1092, over 24471.00 frames. ], tot_loss[loss=0.2881, simple_loss=0.3368, pruned_loss=0.1197, over 4727396.98 frames. ], batch size: 63, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:24:40,996 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.674e+02 3.068e+02 3.788e+02 5.664e+02, threshold=6.136e+02, percent-clipped=0.0 2023-09-28 17:24:41,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:24:42,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:24:43,828 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.76 vs. limit=15.0 2023-09-28 17:24:44,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:45,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:46,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 17:24:49,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:24:49,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:49,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:24:49,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:24:50,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 17:24:52,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:24:52,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:54,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:24:54,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:58,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:25:00,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 17:25:00,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 17:25:00,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=88893.33333333333, ans=0.0 2023-09-28 17:25:05,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:25:05,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:12,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:25:12,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:25:14,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:25:14,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:25:17,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:18,148 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.71 vs. limit=15.0 2023-09-28 17:25:21,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:22,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 17:25:22,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:25:27,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:27,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:25:34,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:25:36,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:25:40,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:25:40,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:25:44,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:44,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:46,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:48,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:25:49,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:49,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:25:51,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=89093.33333333333, ans=0.125 2023-09-28 17:25:53,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:54,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:54,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:57,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 17:25:59,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:02,657 INFO [train.py:1039] (0/4) Epoch 3, batch 2750, loss[loss=0.3179, simple_loss=0.3439, pruned_loss=0.146, over 23587.00 frames. ], tot_loss[loss=0.2882, simple_loss=0.336, pruned_loss=0.1202, over 4710282.21 frames. ], batch size: 256, lr: 2.93e-02, grad_scale: 16.0 2023-09-28 17:26:02,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:26:02,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-09-28 17:26:04,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 17:26:04,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:07,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:07,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:10,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:10,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:26:10,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:16,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:17,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:26:17,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:26:17,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:17,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-09-28 17:26:17,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:26:17,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:24,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 17:26:27,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:26:27,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:29,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:26:29,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:26:30,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:32,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:26:32,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:33,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:37,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 17:26:37,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:26:39,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:26:39,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:42,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:26:44,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=89293.33333333333, ans=0.0 2023-09-28 17:26:47,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:50,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:26:50,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:54,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:54,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:26:54,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:27:01,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 17:27:03,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:27:03,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 17:27:07,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:09,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 17:27:14,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:27:17,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:27:17,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 17:27:19,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:27:23,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:27:23,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 17:27:23,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:27:26,167 INFO [train.py:1039] (0/4) Epoch 3, batch 2800, loss[loss=0.2733, simple_loss=0.3066, pruned_loss=0.12, over 22650.00 frames. 
], tot_loss[loss=0.2879, simple_loss=0.335, pruned_loss=0.1204, over 4693409.86 frames. ], batch size: 322, lr: 2.93e-02, grad_scale: 32.0 2023-09-28 17:27:27,581 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.563e+02 3.005e+02 3.573e+02 5.260e+02, threshold=6.010e+02, percent-clipped=0.0 2023-09-28 17:27:27,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:27:27,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:27,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:27:29,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 17:27:29,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:29,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:31,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:32,573 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 17:27:32,574 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 17:27:35,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 17:27:37,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:27:37,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:27:42,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:27:44,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 17:27:47,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:27:49,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 17:27:50,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:50,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:27:50,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:27:54,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:27:54,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:54,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. 
Duration: 0.67775 2023-09-28 17:27:56,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:04,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:28:07,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:10,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:10,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:28:11,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:17,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:17,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 17:28:17,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:21,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:21,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:28:24,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 17:28:25,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:30,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:32,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:28:32,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:32,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:28:32,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:28:33,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:28:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:28:34,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 17:28:34,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:36,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:36,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:28:37,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 17:28:38,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=89760.0, ans=0.2 2023-09-28 17:28:39,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:39,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:28:40,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:28:43,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 17:28:45,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=89760.0, ans=0.0 2023-09-28 17:28:49,328 INFO [train.py:1039] (0/4) Epoch 3, batch 2850, loss[loss=0.3074, simple_loss=0.339, pruned_loss=0.1379, over 23411.00 frames. ], tot_loss[loss=0.2864, simple_loss=0.3338, pruned_loss=0.1195, over 4695192.76 frames. ], batch size: 285, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:28:49,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:49,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:28:51,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 17:28:52,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:28:56,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:28:56,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:56,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:29:00,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=89826.66666666667, ans=0.125 2023-09-28 17:29:01,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:01,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:29:01,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=89826.66666666667, ans=0.125 2023-09-28 17:29:02,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:29:02,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 17:29:10,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. 
Duration: 0.43 2023-09-28 17:29:10,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:12,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 17:29:13,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:16,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=89893.33333333333, ans=0.125 2023-09-28 17:29:17,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 17:29:17,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 17:29:19,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:31,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:32,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:32,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:29:34,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:29:34,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 17:29:34,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:29:37,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:29:37,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 17:29:41,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:29:41,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:29:42,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:42,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:44,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:48,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:51,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:29:52,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:52,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:53,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:29:58,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:30:00,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 17:30:00,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 17:30:03,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:30:05,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:05,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 17:30:05,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:30:06,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:06,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:30:06,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:30:06,928 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 17:30:08,382 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 17:30:08,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:08,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:13,027 INFO [train.py:1039] (0/4) Epoch 3, batch 2900, loss[loss=0.3604, simple_loss=0.3753, pruned_loss=0.1727, over 19252.00 frames. ], tot_loss[loss=0.2852, simple_loss=0.3327, pruned_loss=0.1189, over 4703962.79 frames. ], batch size: 388, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:30:13,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:15,031 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.599e+02 2.941e+02 3.399e+02 5.344e+02, threshold=5.883e+02, percent-clipped=0.0 2023-09-28 17:30:15,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:30:17,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. 
Duration: 0.75 2023-09-28 17:30:22,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:22,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 17:30:22,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 17:30:24,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:30:24,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:30:26,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:27,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:30:27,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=90160.0, ans=0.1 2023-09-28 17:30:32,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:32,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:37,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:30:37,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-09-28 17:30:38,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:30:39,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=90226.66666666667, ans=0.125 2023-09-28 17:30:40,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:43,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 17:30:43,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 17:30:48,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:48,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 17:30:48,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:30:49,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:30:51,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:52,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:53,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:30:55,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:58,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:00,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 17:31:00,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 17:31:00,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:31:04,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:31:06,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 17:31:06,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:31:11,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:31:21,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:31:21,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:31:23,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-09-28 17:31:27,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:27,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 17:31:28,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:29,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:31:35,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:35,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=90493.33333333333, ans=0.2 2023-09-28 17:31:36,393 INFO [train.py:1039] (0/4) Epoch 3, batch 2950, loss[loss=0.2994, simple_loss=0.3381, pruned_loss=0.1303, over 23454.00 frames. ], tot_loss[loss=0.2852, simple_loss=0.3333, pruned_loss=0.1185, over 4710112.64 frames. ], batch size: 134, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:31:36,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 17:31:38,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:38,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:39,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:31:41,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:31:43,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 17:31:44,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 17:31:45,096 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=90493.33333333333, ans=0.0 2023-09-28 17:31:45,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=90493.33333333333, ans=0.125 2023-09-28 17:31:46,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:31:46,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:52,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:31:55,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:31:57,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:31:57,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:31:59,602 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.34 vs. limit=10.0 2023-09-28 17:31:59,627 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=15.0 2023-09-28 17:32:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:02,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:32:04,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:32:07,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 17:32:08,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=90626.66666666667, ans=0.02 2023-09-28 17:32:12,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 17:32:12,770 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 17:32:12,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 17:32:14,464 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 17:32:14,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=90626.66666666667, ans=0.125 2023-09-28 17:32:16,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 17:32:16,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:32:18,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:32:18,350 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 17:32:18,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:32:22,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 17:32:22,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:32:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:32:25,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:28,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 17:32:28,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:28,821 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 17:32:28,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:30,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 17:32:33,519 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.88 vs. limit=15.0 2023-09-28 17:32:36,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:38,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:32:38,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 17:32:38,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:32:40,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 17:32:43,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:43,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:32:45,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:32:46,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:46,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:32:47,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:32:48,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:48,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:32:49,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:32:50,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:32:54,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:54,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 17:32:56,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:32:59,384 INFO [train.py:1039] (0/4) Epoch 3, batch 3000, loss[loss=0.2704, simple_loss=0.3318, pruned_loss=0.1045, over 24438.00 frames. ], tot_loss[loss=0.2871, simple_loss=0.3351, pruned_loss=0.1195, over 4713286.02 frames. ], batch size: 66, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:32:59,385 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 17:33:13,927 INFO [train.py:1071] (0/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3326, pruned_loss=0.2311, over 1125622.00 frames. 2023-09-28 17:33:13,928 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 17:33:15,398 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.502e+02 2.937e+02 3.419e+02 4.607e+02, threshold=5.874e+02, percent-clipped=0.0 2023-09-28 17:33:15,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:33:16,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:33:18,699 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 17:33:20,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 17:33:23,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:33:23,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:33:24,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. 
Duration: 0.48 2023-09-28 17:33:24,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:30,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=90893.33333333333, ans=0.2 2023-09-28 17:33:32,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:33:42,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:33:48,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 17:33:50,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:33:54,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:33:54,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:54,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:33:57,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:33:57,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 17:34:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-09-28 17:34:03,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:34:03,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:34:05,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:34:05,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:07,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:07,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:34:10,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:34:10,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:34:10,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:34:11,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:13,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 17:34:14,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 17:34:15,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:16,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:34:21,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:21,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:21,614 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=91093.33333333333, ans=0.2 2023-09-28 17:34:22,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:34:22,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 17:34:25,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:34:25,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 17:34:25,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:34:30,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 17:34:31,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. 
Duration: 0.563625 2023-09-28 17:34:33,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:34:33,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 17:34:34,791 INFO [train.py:1039] (0/4) Epoch 3, batch 3050, loss[loss=0.2894, simple_loss=0.3434, pruned_loss=0.1177, over 24035.00 frames. ], tot_loss[loss=0.2879, simple_loss=0.3359, pruned_loss=0.1199, over 4725181.76 frames. ], batch size: 80, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:34:34,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 17:34:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:34:36,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:34:38,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:38,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:34:38,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:40,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:34:41,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. 
Duration: 0.81 2023-09-28 17:34:41,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=91160.0, ans=0.025 2023-09-28 17:34:43,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:34:43,449 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=91160.0, ans=0.125 2023-09-28 17:34:46,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:34:47,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:34:49,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=91226.66666666667, ans=0.2 2023-09-28 17:34:50,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:54,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 17:35:02,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 17:35:02,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 17:35:02,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:07,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 17:35:09,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:09,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:11,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:15,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=91293.33333333333, ans=0.125 2023-09-28 17:35:16,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:35:16,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:16,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:16,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:17,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:20,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:35:22,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:22,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 17:35:23,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:23,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:35:27,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:35:27,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:35:27,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:35:29,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:30,107 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.46 vs. limit=12.0 2023-09-28 17:35:33,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:34,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:39,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:35:40,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:35:40,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:41,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=91426.66666666667, ans=10.0 2023-09-28 17:35:42,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:42,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:35:42,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:44,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 17:35:46,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:46,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 17:35:50,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 17:35:53,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=91426.66666666667, ans=0.125 2023-09-28 17:35:55,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:57,535 INFO [train.py:1039] (0/4) Epoch 3, batch 3100, loss[loss=0.2943, simple_loss=0.3354, pruned_loss=0.1266, over 23724.00 frames. ], tot_loss[loss=0.2873, simple_loss=0.335, pruned_loss=0.1198, over 4722286.02 frames. ], batch size: 149, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:35:57,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:35:58,658 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.21 vs. limit=15.0 2023-09-28 17:35:59,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:36:00,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.573e+02 3.095e+02 3.783e+02 7.787e+02, threshold=6.189e+02, percent-clipped=2.0 2023-09-28 17:36:00,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 17:36:03,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 17:36:05,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 17:36:07,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 17:36:10,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:36:12,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:13,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:36:19,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:25,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 17:36:29,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:36:29,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=91626.66666666667, ans=0.125 2023-09-28 17:36:31,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:32,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:36:33,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:36:33,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:36:35,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:36:35,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 17:36:35,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:36:36,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:39,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 17:36:39,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:36:43,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:36:44,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 17:36:45,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 17:36:47,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:47,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:50,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:36:50,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 17:36:50,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:36:53,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:36:53,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:36:54,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:36:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:36:55,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:55,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 17:37:00,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:37:00,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=91693.33333333333, ans=0.125 2023-09-28 17:37:01,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 17:37:05,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:37:05,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-09-28 17:37:06,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:07,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:08,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 17:37:19,672 INFO [train.py:1039] (0/4) Epoch 3, batch 3150, loss[loss=0.2612, simple_loss=0.2787, pruned_loss=0.1218, over 19206.00 frames. ], tot_loss[loss=0.2856, simple_loss=0.3331, pruned_loss=0.119, over 4729324.48 frames. ], batch size: 388, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:37:19,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 17:37:22,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:23,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:37:25,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:37:25,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 17:37:27,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 17:37:27,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:37:28,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 17:37:30,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:30,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=91826.66666666667, ans=0.0 2023-09-28 17:37:32,287 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 17:37:36,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 17:37:36,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:37:39,108 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 17:37:39,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:37:40,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 17:37:40,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 17:37:40,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. 
Duration: 0.81 2023-09-28 17:37:40,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:40,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:37:42,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:44,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=91893.33333333333, ans=0.125 2023-09-28 17:37:45,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 17:37:47,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:47,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:48,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:50,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:37:54,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 17:37:54,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 17:37:56,611 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=91960.0, ans=0.125 2023-09-28 17:37:57,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:37:57,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:59,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 17:38:03,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 17:38:04,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:38:04,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:38:04,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:38:06,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:06,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:38:06,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:38:07,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 17:38:09,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 17:38:09,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:38:09,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:11,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:38:11,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:38:13,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 17:38:13,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:14,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 17:38:16,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:17,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 17:38:19,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 17:38:20,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 17:38:20,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:21,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 17:38:22,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:38:22,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:25,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:38:27,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:27,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:38:31,374 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.34 vs. limit=15.0 2023-09-28 17:38:34,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:38:34,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:37,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-09-28 17:38:43,053 INFO [train.py:1039] (0/4) Epoch 3, batch 3200, loss[loss=0.2879, simple_loss=0.3378, pruned_loss=0.119, over 24039.00 frames. ], tot_loss[loss=0.2844, simple_loss=0.3316, pruned_loss=0.1186, over 4723638.31 frames. ], batch size: 80, lr: 2.90e-02, grad_scale: 32.0 2023-09-28 17:38:43,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:38:43,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:38:46,890 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.531e+02 2.998e+02 3.452e+02 5.958e+02, threshold=5.995e+02, percent-clipped=0.0 2023-09-28 17:38:47,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:48,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:38:48,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 17:38:51,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:54,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:38:59,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:39:05,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=92226.66666666667, ans=0.125 2023-09-28 17:39:08,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:39:19,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 17:39:21,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:39:24,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 17:39:24,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:39:27,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:39:27,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:39:29,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:39:32,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 17:39:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:39:38,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. 
Duration: 0.82 2023-09-28 17:39:43,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 17:39:45,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:39:50,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=92426.66666666667, ans=0.125 2023-09-28 17:39:51,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:39:51,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,355 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 17:39:51,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:39:51,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=92426.66666666667, ans=0.125 2023-09-28 17:39:55,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:39:56,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 17:39:56,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-09-28 17:39:58,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 17:39:59,085 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.42 vs. limit=15.0 2023-09-28 17:39:59,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 17:40:01,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:40:05,804 INFO [train.py:1039] (0/4) Epoch 3, batch 3250, loss[loss=0.2995, simple_loss=0.3413, pruned_loss=0.1289, over 23604.00 frames. ], tot_loss[loss=0.2836, simple_loss=0.3315, pruned_loss=0.1178, over 4720149.51 frames. ], batch size: 256, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:40:05,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:40:05,910 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 17:40:05,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:05,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:07,459 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 17:40:10,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-28 17:40:15,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:15,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=92493.33333333333, ans=0.125 2023-09-28 17:40:16,631 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.99 vs. limit=6.0 2023-09-28 17:40:19,122 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=92493.33333333333, ans=0.125 2023-09-28 17:40:22,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:40:22,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 17:40:22,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=92560.0, ans=0.0 2023-09-28 17:40:23,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:23,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:40:25,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:40:25,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=92560.0, ans=0.1 2023-09-28 17:40:27,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:27,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:40:27,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=92560.0, ans=0.0 2023-09-28 17:40:30,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:40:30,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:30,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:40:30,762 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:40:32,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:40:33,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:34,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=92560.0, ans=0.125 2023-09-28 17:40:35,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:35,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:37,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:37,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:37,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:40:40,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=92626.66666666667, ans=0.1 2023-09-28 17:40:42,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 17:40:43,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:43,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 17:40:43,623 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=92626.66666666667, ans=0.0 2023-09-28 17:40:46,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:46,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:40:52,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:41:00,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:00,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:00,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 17:41:00,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:41:02,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:41:02,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:05,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 17:41:05,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. 
Duration: 0.94 2023-09-28 17:41:05,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=92693.33333333333, ans=0.125 2023-09-28 17:41:06,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:41:06,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:08,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:08,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:41:09,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:12,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:12,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:15,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 17:41:15,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:18,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:41:18,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. 
Duration: 0.96 2023-09-28 17:41:22,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=92760.0, ans=0.1 2023-09-28 17:41:23,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:23,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 17:41:25,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 17:41:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 17:41:26,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:29,900 INFO [train.py:1039] (0/4) Epoch 3, batch 3300, loss[loss=0.2946, simple_loss=0.3336, pruned_loss=0.1278, over 23709.00 frames. ], tot_loss[loss=0.2841, simple_loss=0.3319, pruned_loss=0.1181, over 4714295.61 frames. ], batch size: 256, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:41:30,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:31,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:41:31,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:41:33,735 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.895e+02 2.576e+02 3.097e+02 3.556e+02 6.978e+02, threshold=6.193e+02, percent-clipped=2.0 2023-09-28 17:41:34,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:41:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:41:35,857 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:41:37,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:40,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:41,073 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.47 vs. limit=10.0 2023-09-28 17:41:43,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 17:41:44,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:41:44,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:46,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=92893.33333333333, ans=0.2 2023-09-28 17:41:47,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:41:47,700 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 17:41:49,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:41:49,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:41:51,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:41:51,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:41:51,495 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 17:41:58,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:58,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:42:00,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=92893.33333333333, ans=0.0 2023-09-28 17:42:01,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:01,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 17:42:03,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. 
Duration: 0.336375 2023-09-28 17:42:03,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:03,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=92960.0, ans=0.125 2023-09-28 17:42:04,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:42:06,446 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 17:42:09,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 17:42:09,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:13,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 17:42:17,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:18,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=93026.66666666667, ans=0.125 2023-09-28 17:42:19,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:42:20,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:23,071 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.23 vs. 
limit=12.0 2023-09-28 17:42:23,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:23,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:23,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:42:23,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:42:26,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:42:26,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:26,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:42:28,338 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 17:42:30,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 17:42:32,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:42:32,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:42:32,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:42:34,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:34,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:36,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:42:37,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:37,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:42:37,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:39,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:42:42,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 17:42:44,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:44,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:47,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:42:47,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 17:42:47,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=93093.33333333333, ans=0.1 2023-09-28 17:42:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:50,956 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=93160.0, ans=0.2 2023-09-28 17:42:52,059 INFO [train.py:1039] (0/4) Epoch 3, batch 3350, loss[loss=0.2743, simple_loss=0.3455, pruned_loss=0.1015, over 24519.00 frames. ], tot_loss[loss=0.2857, simple_loss=0.3337, pruned_loss=0.1188, over 4719501.64 frames. ], batch size: 71, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:42:52,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:52,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:53,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:55,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:56,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:57,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=93160.0, ans=0.125 2023-09-28 17:42:59,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 17:43:02,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:43:03,600 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.91 vs. limit=15.0 2023-09-28 17:43:05,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:05,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:43:06,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 17:43:08,373 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 17:43:08,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:14,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 17:43:14,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 17:43:14,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:43:16,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:43:16,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:43:17,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 17:43:17,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:17,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:43:20,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:22,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:22,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:24,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:43:27,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:29,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:29,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=93293.33333333333, ans=0.0 2023-09-28 17:43:30,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 17:43:30,916 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=93293.33333333333, ans=0.125 2023-09-28 17:43:33,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:43:35,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:36,350 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.38 vs. limit=22.5 2023-09-28 17:43:39,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:39,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:42,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:45,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 17:43:45,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:43:45,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 17:43:45,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 17:43:47,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 17:43:49,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:50,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:57,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:57,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 17:43:57,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:43:59,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:44:00,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:44:05,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:08,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 17:44:10,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:44:10,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 17:44:11,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:12,980 INFO [train.py:1039] (0/4) Epoch 3, batch 3400, loss[loss=0.2852, simple_loss=0.3212, pruned_loss=0.1246, over 23444.00 frames. ], tot_loss[loss=0.2868, simple_loss=0.3347, pruned_loss=0.1194, over 4712122.68 frames. ], batch size: 134, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:44:13,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 17:44:13,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:44:13,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 17:44:15,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:15,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:15,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:44:17,386 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.863e+02 2.557e+02 2.981e+02 3.725e+02 6.496e+02, threshold=5.961e+02, percent-clipped=1.0 2023-09-28 17:44:17,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:44:18,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. 
Duration: 0.69 2023-09-28 17:44:22,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 17:44:22,692 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 17:44:22,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:44:22,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=93493.33333333333, ans=0.0 2023-09-28 17:44:28,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:28,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:44:28,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:29,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:44:34,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=93560.0, ans=0.0 2023-09-28 17:44:35,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:44:37,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. 
Duration: 0.93 2023-09-28 17:44:42,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=93560.0, ans=0.1 2023-09-28 17:44:43,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:44:45,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:46,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:46,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:44:51,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=93626.66666666667, ans=0.125 2023-09-28 17:44:55,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:44:57,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=93626.66666666667, ans=0.1 2023-09-28 17:45:00,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 17:45:04,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:05,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 17:45:05,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 17:45:07,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:07,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:07,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:45:08,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:45:11,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:45:16,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:45:16,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:45:22,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:24,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 17:45:32,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=93760.0, ans=0.125 2023-09-28 17:45:33,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 17:45:36,560 INFO [train.py:1039] (0/4) Epoch 3, batch 3450, loss[loss=0.3056, simple_loss=0.3575, pruned_loss=0.1268, over 23925.00 frames. ], tot_loss[loss=0.2872, simple_loss=0.3347, pruned_loss=0.1198, over 4711999.15 frames. ], batch size: 86, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:45:38,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 17:45:42,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 17:45:43,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:45,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:45:45,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 17:45:46,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:49,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:45:55,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:45:55,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:45:55,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 17:45:55,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:59,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:02,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=93893.33333333333, ans=0.125 2023-09-28 17:46:04,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 17:46:12,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 17:46:12,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:46:12,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:46:13,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:20,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 17:46:21,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:46:25,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:25,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:46:25,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=94026.66666666667, ans=0.125 2023-09-28 17:46:27,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:46:28,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:46:29,689 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.07 vs. limit=15.0 2023-09-28 17:46:30,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 17:46:30,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:46:30,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:46:39,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 17:46:42,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:46:47,431 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.81 vs. 
limit=10.0 2023-09-28 17:46:49,043 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=6.52 vs. limit=15.0 2023-09-28 17:46:49,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:46:49,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:51,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=94093.33333333333, ans=0.125 2023-09-28 17:46:52,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:46:56,386 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.19 vs. limit=15.0 2023-09-28 17:46:57,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:57,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:57,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:46:58,896 INFO [train.py:1039] (0/4) Epoch 3, batch 3500, loss[loss=0.2875, simple_loss=0.3285, pruned_loss=0.1232, over 23299.00 frames. ], tot_loss[loss=0.2848, simple_loss=0.333, pruned_loss=0.1183, over 4723031.81 frames. 
], batch size: 105, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:46:58,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:47:02,782 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.76 vs. limit=12.0 2023-09-28 17:47:03,553 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.532e+02 3.066e+02 3.931e+02 6.870e+02, threshold=6.132e+02, percent-clipped=2.0 2023-09-28 17:47:03,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:07,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:47:07,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 17:47:09,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:47:14,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 17:47:15,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:15,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 17:47:23,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 17:47:23,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:47:25,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:47:25,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:25,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:47:25,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:26,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:26,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 17:47:26,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=94226.66666666667, ans=0.125 2023-09-28 17:47:29,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:29,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:47:29,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:33,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 17:47:34,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 17:47:36,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:39,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:41,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:47:43,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:43,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=94293.33333333333, ans=0.0 2023-09-28 17:47:45,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:47:45,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:45,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 17:47:46,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 17:47:48,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 17:47:49,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:47:50,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:52,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:53,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:47:56,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:47:56,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:48:02,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:03,546 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.78 vs. limit=15.0 2023-09-28 17:48:04,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 17:48:04,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 17:48:04,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:06,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:07,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 17:48:09,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:12,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 17:48:12,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:14,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:48:16,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 17:48:18,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 17:48:18,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=94426.66666666667, ans=0.1 2023-09-28 17:48:21,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:22,659 INFO [train.py:1039] (0/4) Epoch 3, batch 3550, loss[loss=0.2676, simple_loss=0.3312, pruned_loss=0.1019, over 24341.00 frames. ], tot_loss[loss=0.283, simple_loss=0.3308, pruned_loss=0.1176, over 4718371.22 frames. ], batch size: 74, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:48:22,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:22,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:48:24,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:26,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=94493.33333333333, ans=0.125 2023-09-28 17:48:27,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:48:39,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:42,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:48:43,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:45,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:48:46,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:49,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:48:49,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:48:52,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 17:48:52,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:48:52,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:52,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:48:53,126 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.95 vs. limit=22.5 2023-09-28 17:48:54,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:48:59,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:48:59,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:49:02,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:02,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:49:04,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:49:04,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 17:49:04,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:49:04,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:06,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:49:12,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:14,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:49:14,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:16,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 17:49:17,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:49:18,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=94693.33333333333, ans=0.125 2023-09-28 17:49:19,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 17:49:21,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:22,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:49:24,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 17:49:26,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 17:49:27,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:33,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=94760.0, ans=0.125 2023-09-28 17:49:34,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:35,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 17:49:36,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:40,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=94760.0, ans=0.125 2023-09-28 17:49:43,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:44,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 17:49:46,245 INFO [train.py:1039] (0/4) Epoch 3, batch 3600, loss[loss=0.2707, simple_loss=0.3362, pruned_loss=0.1026, over 24358.00 frames. ], tot_loss[loss=0.2811, simple_loss=0.3297, pruned_loss=0.1162, over 4715850.18 frames. 
], batch size: 74, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:49:50,960 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.527e+02 2.760e+02 3.413e+02 5.643e+02, threshold=5.521e+02, percent-clipped=0.0 2023-09-28 17:49:51,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 17:49:52,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:49:52,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=94826.66666666667, ans=0.125 2023-09-28 17:49:54,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:49:55,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:55,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:57,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:50:00,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:02,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:02,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 17:50:04,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:50:04,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:04,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 17:50:09,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:50:09,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=94893.33333333333, ans=0.1 2023-09-28 17:50:10,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:14,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:17,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:19,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:50:19,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 17:50:20,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 17:50:21,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:22,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:50:25,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:27,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:29,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:50:30,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 17:50:36,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:50:37,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:50:37,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 17:50:42,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:50:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:53,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 17:50:58,431 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:50:59,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:50:59,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:50:59,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 17:51:01,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 17:51:01,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 17:51:03,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:51:04,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:51:06,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 17:51:06,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:07,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:51:07,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 17:51:08,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 17:51:09,432 INFO [train.py:1039] (0/4) Epoch 3, batch 3650, loss[loss=0.343, simple_loss=0.3608, pruned_loss=0.1626, over 22648.00 frames. ], tot_loss[loss=0.2831, simple_loss=0.3313, pruned_loss=0.1175, over 4708697.52 frames. ], batch size: 322, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:51:09,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 17:51:12,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:51:14,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 17:51:14,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=95160.0, ans=0.125 2023-09-28 17:51:19,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 17:51:21,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:51:24,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 17:51:25,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 17:51:26,016 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.56 vs. 
limit=15.0 2023-09-28 17:51:29,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:51:29,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:51:29,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:51:30,515 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.70 vs. limit=15.0 2023-09-28 17:51:32,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:51:34,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:34,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 17:51:36,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:51:36,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=95226.66666666667, ans=0.0 2023-09-28 17:51:37,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:37,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-09-28 17:51:38,241 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=95226.66666666667, ans=0.125 2023-09-28 17:51:39,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:51:40,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:51:40,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:41,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:51:44,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 17:51:44,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 17:51:45,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:51:48,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 17:51:49,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:51:49,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:51:56,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 17:51:57,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:57,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:51:57,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=95360.0, ans=0.125 2023-09-28 17:51:59,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:52:01,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:52:01,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=95360.0, ans=0.125 2023-09-28 17:52:03,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:52:06,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:08,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:08,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:52:10,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 17:52:12,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:52:12,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:14,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=95426.66666666667, ans=0.2 2023-09-28 17:52:18,668 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 17:52:23,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:23,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:25,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:52:25,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:25,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=95426.66666666667, ans=0.0 2023-09-28 17:52:26,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:52:28,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:30,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-09-28 17:52:30,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:31,706 INFO [train.py:1039] (0/4) Epoch 3, batch 3700, loss[loss=0.2789, simple_loss=0.3299, pruned_loss=0.114, over 24698.00 frames. ], tot_loss[loss=0.2828, simple_loss=0.3316, pruned_loss=0.117, over 4718816.37 frames. ], batch size: 65, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:52:33,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:52:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:37,004 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.521e+02 2.916e+02 3.663e+02 5.180e+02, threshold=5.833e+02, percent-clipped=0.0 2023-09-28 17:52:37,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:52:38,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:38,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 17:52:38,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:40,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:52:40,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 17:52:43,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:52:46,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:48,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:49,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:52:49,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:51,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:52:52,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:55,031 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 17:53:01,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:53:01,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:53:03,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:53:04,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. 
Duration: 0.96 2023-09-28 17:53:04,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:04,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=95626.66666666667, ans=0.1 2023-09-28 17:53:08,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:09,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 17:53:13,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:13,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:53:16,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:18,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:53:19,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=95626.66666666667, ans=0.0 2023-09-28 17:53:21,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:53:26,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:26,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. 
Duration: 0.81 2023-09-28 17:53:27,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:53:27,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 17:53:29,581 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:53:31,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=95693.33333333333, ans=0.0 2023-09-28 17:53:33,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:53:33,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:53:36,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:36,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 17:53:39,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:53:39,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:53:39,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:39,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 17:53:43,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=95760.0, ans=0.125 2023-09-28 17:53:45,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:45,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 17:53:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 17:53:47,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:53:48,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:53:49,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:53:51,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:53:54,536 INFO [train.py:1039] (0/4) Epoch 3, batch 3750, loss[loss=0.2817, simple_loss=0.3462, pruned_loss=0.1085, over 24347.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3324, pruned_loss=0.117, over 4731958.81 frames. ], batch size: 77, lr: 2.85e-02, grad_scale: 32.0 2023-09-28 17:53:54,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:54,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 17:53:57,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:00,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 17:54:00,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 17:54:01,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=95826.66666666667, ans=0.125 2023-09-28 17:54:03,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:54:03,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 17:54:05,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:54:07,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:07,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=95826.66666666667, ans=0.125 2023-09-28 17:54:07,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=95826.66666666667, ans=0.125 2023-09-28 17:54:08,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 17:54:11,200 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.19 vs. limit=12.0 2023-09-28 17:54:11,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:14,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:18,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:54:18,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:54:20,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:54:22,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=95893.33333333333, ans=0.0 2023-09-28 17:54:23,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:23,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 17:54:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:27,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:27,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 17:54:29,740 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-09-28 17:54:30,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 17:54:35,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 17:54:37,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:37,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:40,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:44,667 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.40 vs. limit=22.5 2023-09-28 17:54:45,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:45,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:54:50,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 17:54:52,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:54:57,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:57,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:55:00,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:55:04,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:55:06,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:55:09,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:55:10,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:55:11,445 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.00 vs. limit=6.0 2023-09-28 17:55:14,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:55:17,121 INFO [train.py:1039] (0/4) Epoch 3, batch 3800, loss[loss=0.2692, simple_loss=0.3397, pruned_loss=0.09938, over 24487.00 frames. ], tot_loss[loss=0.285, simple_loss=0.334, pruned_loss=0.1181, over 4725662.44 frames. 
], batch size: 69, lr: 2.85e-02, grad_scale: 16.0 2023-09-28 17:55:23,802 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.013e+02 2.428e+02 2.901e+02 3.496e+02 5.183e+02, threshold=5.803e+02, percent-clipped=0.0 2023-09-28 17:55:23,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:55:26,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:27,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:55:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 17:55:27,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=96160.0, ans=0.125 2023-09-28 17:55:28,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:30,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:32,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:55:33,364 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.75 vs. limit=15.0 2023-09-28 17:55:33,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-28 17:55:33,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:36,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:55:37,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:37,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:55:37,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:39,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 17:55:43,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:55:43,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:55:47,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:49,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:55:49,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:55:52,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-09-28 17:55:52,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:54,632 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=96293.33333333333, ans=0.0 2023-09-28 17:55:56,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:57,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:56:03,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:56:03,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 17:56:05,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:05,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=96360.0, ans=0.1 2023-09-28 17:56:12,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:16,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:56:20,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 17:56:22,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. 
Duration: 0.84 2023-09-28 17:56:23,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:25,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:25,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:26,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 17:56:30,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 17:56:31,224 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.38 vs. limit=10.0 2023-09-28 17:56:31,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 17:56:31,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:33,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:36,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=96426.66666666667, ans=0.125 2023-09-28 17:56:39,351 INFO [train.py:1039] (0/4) Epoch 3, batch 3850, loss[loss=0.2765, simple_loss=0.3001, pruned_loss=0.1264, over 23457.00 frames. ], tot_loss[loss=0.2836, simple_loss=0.3322, pruned_loss=0.1175, over 4725169.58 frames. 
], batch size: 285, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:56:40,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:56:41,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:56:46,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:56:47,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 17:56:48,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:56:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:53,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:56:55,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=96560.0, ans=0.125 2023-09-28 17:56:55,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=96560.0, ans=0.0 2023-09-28 17:56:58,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:59,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 17:57:01,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 17:57:08,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:09,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:57:11,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:13,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:57:16,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:18,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:57:19,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:20,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:57:20,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 17:57:21,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:57:21,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 17:57:23,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 17:57:23,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:23,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:26,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 17:57:30,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 17:57:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:33,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 17:57:36,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 17:57:42,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:43,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:48,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:49,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 17:57:51,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 17:57:53,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:54,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:58,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:57:58,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:57:59,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 17:58:01,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 17:58:02,456 INFO [train.py:1039] (0/4) Epoch 3, batch 3900, loss[loss=0.2929, simple_loss=0.3266, pruned_loss=0.1296, over 23428.00 frames. ], tot_loss[loss=0.2823, simple_loss=0.3307, pruned_loss=0.117, over 4715925.57 frames. ], batch size: 285, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:58:02,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:58:02,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=96826.66666666667, ans=0.125 2023-09-28 17:58:04,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 17:58:04,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:04,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:07,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:58:07,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:09,131 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.471e+02 2.886e+02 3.509e+02 5.748e+02, threshold=5.772e+02, percent-clipped=0.0 2023-09-28 17:58:09,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 17:58:10,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:10,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:58:10,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:10,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 17:58:12,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:15,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:15,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:15,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:58:15,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=96826.66666666667, ans=0.125 2023-09-28 17:58:17,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:20,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:21,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 17:58:21,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=96893.33333333333, ans=0.07 2023-09-28 17:58:25,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:58:26,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 17:58:26,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:28,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 17:58:28,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:29,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 17:58:31,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 17:58:34,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:36,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:36,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:58:37,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 17:58:40,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:43,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:58:45,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:58:45,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:58:47,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:58:54,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:55,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:58:56,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=97026.66666666667, ans=0.125 2023-09-28 17:59:03,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:59:04,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=97026.66666666667, ans=0.1 2023-09-28 17:59:05,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 17:59:15,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:18,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:18,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 17:59:20,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 17:59:20,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:21,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 17:59:22,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=97093.33333333333, ans=0.125 2023-09-28 17:59:23,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:59:25,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 17:59:27,211 INFO [train.py:1039] (0/4) Epoch 3, batch 3950, loss[loss=0.3071, simple_loss=0.3424, pruned_loss=0.1359, over 22822.00 frames. ], tot_loss[loss=0.282, simple_loss=0.3303, pruned_loss=0.1168, over 4707487.87 frames. ], batch size: 322, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:59:33,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 17:59:34,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 17:59:35,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:59:38,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:59:39,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:59:44,566 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 17:59:45,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:46,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 17:59:47,490 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 17:59:47,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:47,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=97226.66666666667, ans=0.125 2023-09-28 17:59:51,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:52,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 17:59:52,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:55,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 17:59:57,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:59:57,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:57,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:59:57,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=97226.66666666667, ans=0.0 2023-09-28 17:59:58,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:59:58,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:00:12,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:00:12,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:00:17,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-09-28 18:00:21,329 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.06 vs. limit=15.0 2023-09-28 18:00:23,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 18:00:23,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 18:00:23,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:00:25,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:00:33,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:00:33,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:00:33,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:00:33,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:00:33,836 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=97426.66666666667, ans=0.0 2023-09-28 18:00:35,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 18:00:41,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 18:00:42,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:00:46,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 18:00:49,615 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:00:50,722 INFO [train.py:1039] (0/4) Epoch 3, batch 4000, loss[loss=0.2957, simple_loss=0.3479, pruned_loss=0.1217, over 24392.00 frames. ], tot_loss[loss=0.2827, simple_loss=0.3312, pruned_loss=0.1171, over 4712744.08 frames. ], batch size: 77, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:00:55,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:00:56,951 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.653e+02 3.032e+02 3.720e+02 5.555e+02, threshold=6.065e+02, percent-clipped=0.0 2023-09-28 18:01:03,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:09,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:09,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:01:10,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:10,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. 
Duration: 0.93 2023-09-28 18:01:12,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:01:12,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 18:01:12,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:01:14,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 18:01:16,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:19,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:01:20,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:01:20,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:01:20,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:20,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:01:22,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:01:24,496 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. 
Duration: 0.896 2023-09-28 18:01:25,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:01:27,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:29,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=97626.66666666667, ans=0.125 2023-09-28 18:01:30,359 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 18:01:31,077 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.56 vs. limit=15.0 2023-09-28 18:01:31,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:01:31,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:38,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 18:01:38,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:41,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:01:41,644 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 18:01:43,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 18:01:43,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 18:01:43,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:01:45,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:47,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:01:49,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:01:49,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:01:49,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:49,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 18:01:50,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:52,574 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 18:01:58,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:02:03,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-09-28 18:02:05,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:02:06,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:06,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:02:08,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:12,363 INFO [train.py:1039] (0/4) Epoch 3, batch 4050, loss[loss=0.2772, simple_loss=0.3289, pruned_loss=0.1127, over 24054.00 frames. ], tot_loss[loss=0.2818, simple_loss=0.3313, pruned_loss=0.1161, over 4715427.42 frames. ], batch size: 86, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:02:16,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:18,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=97826.66666666667, ans=0.125 2023-09-28 18:02:19,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:02:19,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 18:02:19,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=97826.66666666667, ans=0.125 2023-09-28 18:02:20,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 18:02:22,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:02:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:02:24,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:26,291 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=97826.66666666667, ans=0.0 2023-09-28 18:02:27,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:30,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:31,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:02:32,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 18:02:32,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=97893.33333333333, ans=0.125 2023-09-28 18:02:34,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:02:35,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:02:39,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:39,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=97893.33333333333, ans=0.0 2023-09-28 18:02:40,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:43,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:02:45,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 18:02:45,665 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 18:02:50,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:02:52,223 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=97960.0, ans=0.0 2023-09-28 18:02:57,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 18:02:59,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:01,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:04,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:03:05,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:03:05,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:09,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:03:12,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 18:03:12,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:03:13,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:15,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 18:03:18,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:25,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 18:03:27,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:27,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:03:28,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-09-28 18:03:30,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 18:03:30,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:32,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:03:33,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:33,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:03:35,372 INFO [train.py:1039] (0/4) Epoch 3, batch 4100, loss[loss=0.3226, simple_loss=0.3537, pruned_loss=0.1458, over 23886.00 frames. ], tot_loss[loss=0.285, simple_loss=0.3334, pruned_loss=0.1183, over 4697898.48 frames. ], batch size: 195, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:03:35,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=98160.0, ans=0.125 2023-09-28 18:03:42,054 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.855e+02 2.385e+02 2.703e+02 3.359e+02 5.329e+02, threshold=5.406e+02, percent-clipped=0.0 2023-09-28 18:03:43,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 18:03:45,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 18:03:48,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. 
Duration: 0.94 2023-09-28 18:03:49,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 18:03:49,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:49,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:49,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:51,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:03:52,739 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 18:03:55,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:03:56,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:03:58,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:58,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:04:02,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:04:02,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:04:02,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:04:04,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 18:04:05,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:05,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:04:05,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:05,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:04:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 18:04:08,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=98293.33333333333, ans=0.125 2023-09-28 18:04:09,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:11,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 18:04:12,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:04:16,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 18:04:16,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 18:04:18,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:04:18,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:04:18,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:04:18,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=98293.33333333333, ans=0.07 2023-09-28 18:04:19,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 18:04:22,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:04:24,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:04:25,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 18:04:27,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:27,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:29,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:04:35,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:04:36,564 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.89 vs. limit=15.0 2023-09-28 18:04:39,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:39,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:04:48,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:04:48,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:51,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:53,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:04:57,940 INFO [train.py:1039] (0/4) Epoch 3, batch 4150, loss[loss=0.2749, simple_loss=0.3064, pruned_loss=0.1217, over 23360.00 frames. ], tot_loss[loss=0.2846, simple_loss=0.3327, pruned_loss=0.1182, over 4696317.20 frames. ], batch size: 285, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:04:58,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:59,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 18:04:59,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:04:59,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:04,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 18:05:04,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:04,573 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:05:06,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 18:05:07,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 18:05:08,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 18:05:10,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:11,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=98493.33333333333, ans=0.2 2023-09-28 18:05:15,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:05:15,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:05:18,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:19,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:05:19,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:05:21,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:05:21,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:23,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:05:27,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:29,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=98626.66666666667, ans=0.0 2023-09-28 18:05:32,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:33,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 18:05:35,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 18:05:36,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 18:05:36,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 18:05:36,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:05:36,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:05:40,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=98626.66666666667, ans=0.1 2023-09-28 18:05:42,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:05:42,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:46,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 18:05:50,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:05:51,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:05:52,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=98693.33333333333, ans=0.125 2023-09-28 18:05:53,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 18:05:54,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 18:05:57,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 18:05:57,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:05:58,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:06:00,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:01,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 18:06:01,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:01,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:06:03,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:06:06,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 18:06:06,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:06,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:06:06,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 18:06:08,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 18:06:08,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:06:08,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 18:06:09,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:06:11,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:11,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 18:06:13,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:06:17,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:06:19,356 INFO [train.py:1039] (0/4) Epoch 3, batch 4200, loss[loss=0.2675, simple_loss=0.2926, pruned_loss=0.1212, over 22671.00 frames. ], tot_loss[loss=0.2837, simple_loss=0.3311, pruned_loss=0.1181, over 4678256.10 frames. ], batch size: 322, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:06:19,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 18:06:19,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 18:06:22,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:25,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:06:25,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:25,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:26,709 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.537e+02 2.926e+02 3.391e+02 4.648e+02, threshold=5.852e+02, percent-clipped=0.0 2023-09-28 18:06:26,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 18:06:27,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=98826.66666666667, ans=0.1 2023-09-28 18:06:28,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 18:06:30,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:33,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:35,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:06:36,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=98893.33333333333, ans=0.0 2023-09-28 18:06:37,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:06:39,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:06:40,077 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.01 vs. limit=15.0 2023-09-28 18:06:40,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:42,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 18:06:42,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:44,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:44,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:44,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:06:45,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 18:06:45,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=98893.33333333333, ans=0.5 2023-09-28 18:06:48,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 18:06:48,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:56,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:06:57,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:06:59,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:07:02,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:05,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:07:05,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 18:07:05,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:07,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:07:12,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 18:07:12,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=99026.66666666667, ans=0.125 2023-09-28 18:07:13,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:20,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:07:21,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 18:07:25,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:30,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:07:30,939 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.12 vs. limit=15.0 2023-09-28 18:07:31,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:33,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 18:07:37,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=99093.33333333333, ans=0.125 2023-09-28 18:07:40,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. 
Duration: 0.563625 2023-09-28 18:07:41,854 INFO [train.py:1039] (0/4) Epoch 3, batch 4250, loss[loss=0.2962, simple_loss=0.3332, pruned_loss=0.1296, over 23793.00 frames. ], tot_loss[loss=0.2819, simple_loss=0.3297, pruned_loss=0.1171, over 4685382.22 frames. ], batch size: 164, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:07:45,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:45,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:07:45,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=99160.0, ans=0.125 2023-09-28 18:07:45,986 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.92 vs. limit=15.0 2023-09-28 18:07:46,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:51,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:07:52,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 18:07:52,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:56,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:00,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:08:05,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:05,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:08,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:08:08,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:08,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:10,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:12,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:08:16,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:17,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=99293.33333333333, ans=0.125 2023-09-28 18:08:18,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. 
Duration: 0.98 2023-09-28 18:08:21,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 18:08:21,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:22,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:22,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:24,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:08:24,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:24,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:27,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:08:27,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:08:28,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=99293.33333333333, ans=0.0 2023-09-28 18:08:33,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:08:35,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:08:35,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 18:08:35,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:08:37,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 18:08:39,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:08:41,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:08:44,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:44,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:46,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 18:08:47,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:08:48,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:08:52,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:55,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:08:57,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:08:58,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:02,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:03,680 INFO [train.py:1039] (0/4) Epoch 3, batch 4300, loss[loss=0.3076, simple_loss=0.3634, pruned_loss=0.1259, over 24557.00 frames. ], tot_loss[loss=0.2796, simple_loss=0.3282, pruned_loss=0.1155, over 4696396.43 frames. ], batch size: 71, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:09:03,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:09:05,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:05,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 18:09:05,517 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:09:06,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:12,364 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.893e+02 2.623e+02 3.036e+02 3.611e+02 5.200e+02, threshold=6.071e+02, percent-clipped=0.0 2023-09-28 18:09:12,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:09:12,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:17,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:23,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:09:23,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 18:09:25,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:09:28,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:09:28,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:09:28,510 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 18:09:31,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:09:33,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:09:36,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 18:09:36,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 18:09:36,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 18:09:40,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:09:41,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:09:47,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:09:47,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:47,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:09:48,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:50,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:50,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 18:09:50,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=99626.66666666667, ans=0.125 2023-09-28 18:09:51,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 18:09:53,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:09:55,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:55,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:09:55,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:56,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:56,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 18:09:56,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 18:09:56,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 18:09:58,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:09:58,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 18:10:00,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 18:10:00,722 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=99693.33333333333, ans=0.0 2023-09-28 18:10:03,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:10:03,553 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 18:10:04,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:10:06,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:06,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:10,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 18:10:10,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:10:10,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:10,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:10,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:10,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:10:12,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:10:15,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:10:15,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=99760.0, ans=0.125 2023-09-28 18:10:16,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:16,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:23,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 18:10:23,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:10:26,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn2.whiten.whitening_limit, batch_count=99826.66666666667, ans=22.5 2023-09-28 18:10:26,393 INFO [train.py:1039] (0/4) Epoch 3, batch 4350, loss[loss=0.2925, simple_loss=0.3324, pruned_loss=0.1263, over 23493.00 frames. ], tot_loss[loss=0.2795, simple_loss=0.3287, pruned_loss=0.1152, over 4716614.99 frames. ], batch size: 285, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:10:29,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:10:31,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:34,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:10:34,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:10:40,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:10:44,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:47,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:10:47,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:48,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:10:53,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:10:55,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:11:01,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 18:11:02,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:02,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:08,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:10,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. 
Duration: 0.99 2023-09-28 18:11:15,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:18,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:11:20,796 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 18:11:20,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:21,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=100026.66666666667, ans=0.0 2023-09-28 18:11:22,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:11:24,110 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 18:11:24,219 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 18:11:24,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:24,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:25,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:11:27,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:11:29,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:29,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:11:32,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 18:11:32,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:32,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:33,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:33,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 18:11:35,383 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 18:11:35,390 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 18:11:35,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 18:11:38,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:11:38,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 18:11:39,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:11:39,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:11:41,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 18:11:45,061 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 18:11:45,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:49,525 INFO [train.py:1039] (0/4) Epoch 3, batch 4400, loss[loss=0.2799, simple_loss=0.3246, pruned_loss=0.1176, over 23662.00 frames. ], tot_loss[loss=0.2808, simple_loss=0.3301, pruned_loss=0.1157, over 4723974.59 frames. ], batch size: 164, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:11:49,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:11:49,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:51,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:56,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 18:11:56,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. 
Duration: 0.7655625 2023-09-28 18:11:56,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 18:11:56,291 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 18:11:57,544 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.556e+02 3.170e+02 3.495e+02 5.491e+02, threshold=6.340e+02, percent-clipped=0.0 2023-09-28 18:11:57,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:11:57,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:12:01,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 18:12:01,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:04,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:04,490 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 18:12:06,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=100226.66666666667, ans=0.125 2023-09-28 18:12:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:07,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. 
Duration: 0.71 2023-09-28 18:12:07,641 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 18:12:10,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 18:12:12,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 18:12:12,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 18:12:12,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:13,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:16,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 18:12:16,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 18:12:17,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:20,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 18:12:20,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:20,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:21,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:21,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 18:12:23,879 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 18:12:29,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:37,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:40,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 18:12:44,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:12:47,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:12:49,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:12:50,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. 
Duration: 0.84 2023-09-28 18:12:51,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:12:51,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:12:51,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:12:52,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:12:57,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 18:13:02,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 18:13:02,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 18:13:02,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:04,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 18:13:04,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:13:07,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:13:09,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. 
Duration: 0.9 2023-09-28 18:13:10,893 INFO [train.py:1039] (0/4) Epoch 3, batch 4450, loss[loss=0.2788, simple_loss=0.3395, pruned_loss=0.1091, over 24102.00 frames. ], tot_loss[loss=0.2821, simple_loss=0.3314, pruned_loss=0.1164, over 4720997.02 frames. ], batch size: 80, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:13:12,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:13:16,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:16,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:13:16,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=100493.33333333333, ans=0.0 2023-09-28 18:13:23,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:23,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:13:25,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=100560.0, ans=15.0 2023-09-28 18:13:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:28,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:13:28,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 18:13:28,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:30,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 18:13:30,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:32,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:32,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:13:32,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:13:35,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:13:35,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=100560.0, ans=0.125 2023-09-28 18:13:42,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:43,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:45,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:45,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:13:47,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:13:49,891 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.80 vs. limit=12.0 2023-09-28 18:13:52,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:13:52,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=100626.66666666667, ans=0.2 2023-09-28 18:13:53,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 18:13:53,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 18:13:53,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:13:55,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:57,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 18:14:01,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:14:04,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:04,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. 
Duration: 0.8 2023-09-28 18:14:04,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:04,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:04,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:14:04,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:14:07,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:10,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:14:10,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 18:14:13,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:14:14,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:14:17,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:18,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=100760.0, ans=0.0 2023-09-28 18:14:19,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 18:14:19,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:14:21,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:14:26,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 18:14:27,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:14:29,734 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=16.03 vs. limit=15.0 2023-09-28 18:14:31,869 INFO [train.py:1039] (0/4) Epoch 3, batch 4500, loss[loss=0.2628, simple_loss=0.2988, pruned_loss=0.1134, over 23448.00 frames. ], tot_loss[loss=0.2812, simple_loss=0.3307, pruned_loss=0.1159, over 4720747.49 frames. ], batch size: 285, lr: 2.79e-02, grad_scale: 32.0 2023-09-28 18:14:33,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:34,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 18:14:34,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 18:14:36,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:37,283 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.20 vs. 
limit=10.0 2023-09-28 18:14:40,300 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.564e+02 2.888e+02 3.333e+02 4.958e+02, threshold=5.777e+02, percent-clipped=0.0 2023-09-28 18:14:40,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:42,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:42,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:14:44,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:14:44,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:45,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:51,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=100893.33333333333, ans=0.125 2023-09-28 18:14:59,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:59,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:15:03,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:15:04,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:15:05,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:15:07,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=100960.0, ans=0.1 2023-09-28 18:15:12,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:15:17,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:15:21,042 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.20 vs. limit=10.0 2023-09-28 18:15:21,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:15:26,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:15:26,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 18:15:26,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:27,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:15:29,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:29,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:32,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:15:32,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 18:15:32,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:15:34,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:37,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:15:38,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:15:40,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:43,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:15:43,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:15:45,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-09-28 18:15:48,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 18:15:48,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 18:15:53,572 INFO [train.py:1039] (0/4) Epoch 3, batch 4550, loss[loss=0.2718, simple_loss=0.3012, pruned_loss=0.1211, over 23524.00 frames. ], tot_loss[loss=0.28, simple_loss=0.329, pruned_loss=0.1154, over 4716153.08 frames. ], batch size: 256, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:15:53,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 18:15:55,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 18:15:56,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:15:58,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:15:58,771 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=101160.0, ans=0.125 2023-09-28 18:15:59,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:16:03,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:08,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:16:10,520 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.20 vs. limit=15.0 2023-09-28 18:16:11,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:16:12,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:12,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:16:12,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:13,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=101226.66666666667, ans=0.125 2023-09-28 18:16:13,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=101226.66666666667, ans=0.0 2023-09-28 18:16:15,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:15,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:16:19,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:22,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. 
Duration: 0.85 2023-09-28 18:16:22,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 18:16:24,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:16:25,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=101293.33333333333, ans=0.0 2023-09-28 18:16:26,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 18:16:30,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 18:16:30,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:34,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 18:16:36,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:16:39,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:16:42,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. 
Duration: 0.77 2023-09-28 18:16:43,417 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=15.0 2023-09-28 18:16:44,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:16:45,306 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.88 vs. limit=15.0 2023-09-28 18:16:47,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:47,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:48,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:50,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 18:16:52,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 18:16:52,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:16:53,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 18:16:57,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. 
Duration: 0.54 2023-09-28 18:16:57,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:58,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:59,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:59,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:59,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:17:01,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:17:01,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 18:17:04,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:17:04,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:17:05,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 18:17:05,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:17:05,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. 
Duration: 0.96 2023-09-28 18:17:06,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=101426.66666666667, ans=0.5 2023-09-28 18:17:06,229 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=101426.66666666667, ans=0.2 2023-09-28 18:17:08,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:17:08,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:17:10,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:17:10,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:17:10,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:17:12,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:17:15,554 INFO [train.py:1039] (0/4) Epoch 3, batch 4600, loss[loss=0.2831, simple_loss=0.3288, pruned_loss=0.1187, over 23483.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.3276, pruned_loss=0.1155, over 4707736.34 frames. ], batch size: 120, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:17:15,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:17:17,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:17:18,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=101493.33333333333, ans=10.0 2023-09-28 18:17:20,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:17:23,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:17:23,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:17:23,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:24,711 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.433e+02 2.837e+02 3.221e+02 4.908e+02, threshold=5.674e+02, percent-clipped=0.0 2023-09-28 18:17:24,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 18:17:27,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:17:32,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:17:34,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:37,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:42,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. 
Duration: 0.98 2023-09-28 18:17:43,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:45,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:49,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:17:49,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:53,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=101626.66666666667, ans=0.0 2023-09-28 18:17:55,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 18:17:55,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:17:55,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:03,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:18:05,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:18:09,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-09-28 18:18:10,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:18:15,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:16,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:18:18,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:18,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 18:18:18,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:19,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 18:18:20,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:20,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:21,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:23,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:18:25,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:18:25,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 18:18:25,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 18:18:26,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 18:18:26,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:28,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:29,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:29,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:38,789 INFO [train.py:1039] (0/4) Epoch 3, batch 4650, loss[loss=0.2871, simple_loss=0.336, pruned_loss=0.1191, over 23415.00 frames. ], tot_loss[loss=0.2786, simple_loss=0.3277, pruned_loss=0.1148, over 4718432.06 frames. ], batch size: 119, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:18:42,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:18:45,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:45,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:18:46,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:18:47,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:47,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:48,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:52,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 18:18:54,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=101893.33333333333, ans=0.2 2023-09-28 18:18:56,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:18:58,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 18:18:58,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:59,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 18:19:00,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:19:01,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. 
Duration: 0.99 2023-09-28 18:19:01,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 18:19:02,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:02,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:19:04,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=101893.33333333333, ans=0.125 2023-09-28 18:19:05,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:19:07,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:07,449 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 18:19:09,938 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.32 vs. limit=15.0 2023-09-28 18:19:10,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:12,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 18:19:16,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:16,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 18:19:17,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 18:19:19,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:19:22,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:19:26,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:30,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:34,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:35,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:19:38,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 18:19:40,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 18:19:41,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 18:19:41,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. 
Duration: 0.66 2023-09-28 18:19:43,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:19:47,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.79 vs. limit=15.0 2023-09-28 18:19:52,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:19:52,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:19:52,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 18:19:52,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:53,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:53,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:19:56,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:19:57,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:19:57,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 18:19:57,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:57,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=102093.33333333333, ans=0.0 2023-09-28 18:20:00,520 INFO [train.py:1039] (0/4) Epoch 3, batch 4700, loss[loss=0.2453, simple_loss=0.3007, pruned_loss=0.09496, over 24319.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.3288, pruned_loss=0.1148, over 4736620.68 frames. ], batch size: 56, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:20:03,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:05,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:20:05,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:20:05,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:20:06,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:20:06,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 18:20:10,641 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.707e+02 3.161e+02 3.958e+02 7.246e+02, threshold=6.322e+02, percent-clipped=4.0 2023-09-28 18:20:14,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 18:20:14,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:14,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:20:15,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:17,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:20:17,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=102226.66666666667, ans=0.125 2023-09-28 18:20:24,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 18:20:24,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 18:20:25,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:29,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:20:29,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:20:31,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=102226.66666666667, ans=0.125 2023-09-28 18:20:32,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:39,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:20:41,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:20:42,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:47,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=102293.33333333333, ans=22.5 2023-09-28 18:20:48,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 18:20:50,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:20:53,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:20:56,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=102360.0, ans=0.0 2023-09-28 18:20:57,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. 
Duration: 0.84 2023-09-28 18:20:59,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:04,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:21:04,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 18:21:05,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:06,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:09,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:21:10,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:21:10,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 18:21:12,083 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 18:21:13,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:15,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 18:21:15,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 18:21:18,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:21,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 18:21:23,340 INFO [train.py:1039] (0/4) Epoch 3, batch 4750, loss[loss=0.2798, simple_loss=0.3442, pruned_loss=0.1077, over 24549.00 frames. ], tot_loss[loss=0.2817, simple_loss=0.3308, pruned_loss=0.1163, over 4704012.47 frames. ], batch size: 71, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:21:23,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:21:24,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:28,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=102493.33333333333, ans=0.125 2023-09-28 18:21:30,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:30,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:21:33,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 18:21:33,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:21:35,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 18:21:38,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:21:38,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:39,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:45,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 18:21:50,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:21:52,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 18:21:53,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:59,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:59,691 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-09-28 18:21:59,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 18:22:01,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=102626.66666666667, ans=0.125 2023-09-28 18:22:05,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 18:22:08,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:10,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:13,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:22:13,923 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 18:22:13,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:15,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:22:18,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:22:20,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 18:22:20,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. 
Duration: 0.87 2023-09-28 18:22:20,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:22:21,020 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.55 vs. limit=15.0 2023-09-28 18:22:21,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:22:21,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:23,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:22:23,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 18:22:25,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 18:22:28,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:22:31,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:22:31,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 18:22:32,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=102760.0, ans=0.2 2023-09-28 18:22:33,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:22:33,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:34,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:22:36,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:37,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:22:41,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:41,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 18:22:43,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 18:22:44,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 18:22:46,133 INFO [train.py:1039] (0/4) Epoch 3, batch 4800, loss[loss=0.2704, simple_loss=0.3339, pruned_loss=0.1034, over 24377.00 frames. ], tot_loss[loss=0.2824, simple_loss=0.3314, pruned_loss=0.1167, over 4699065.65 frames. ], batch size: 77, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:22:48,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:22:48,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:22:49,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 18:22:50,689 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.41 vs. limit=15.0 2023-09-28 18:22:55,850 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.876e+02 2.499e+02 2.983e+02 3.709e+02 7.262e+02, threshold=5.966e+02, percent-clipped=1.0 2023-09-28 18:22:55,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:56,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:02,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:23:04,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:04,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:04,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 18:23:05,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:23:07,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 18:23:07,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:23:07,783 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=102893.33333333333, ans=0.0 2023-09-28 18:23:13,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:17,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:23:17,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:23:17,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:19,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:22,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:23,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:23:24,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:23:26,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:23:27,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:29,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 18:23:29,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 18:23:32,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:32,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:23:32,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:23:32,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:34,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:23:35,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 18:23:35,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:39,535 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.60 vs. limit=15.0 2023-09-28 18:23:40,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:41,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:44,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:23:48,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 18:23:49,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:49,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:49,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:23:52,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:55,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:23:57,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:23:57,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:57,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:23:58,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:24:00,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:24:03,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:03,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:03,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:24:05,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 18:24:08,451 INFO [train.py:1039] (0/4) Epoch 3, batch 4850, loss[loss=0.2815, simple_loss=0.3251, pruned_loss=0.1189, over 23724.00 frames. ], tot_loss[loss=0.2844, simple_loss=0.3331, pruned_loss=0.1179, over 4702149.83 frames. ], batch size: 149, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:24:08,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. 
Duration: 0.69 2023-09-28 18:24:08,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:08,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:10,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:10,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:13,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:24:21,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 18:24:21,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:25,915 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.32 vs. limit=15.0 2023-09-28 18:24:26,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:28,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:24:28,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:24:32,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:33,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:24:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:24:35,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 18:24:38,468 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.21 vs. limit=15.0 2023-09-28 18:24:39,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:42,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:24:42,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:24:42,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:24:42,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 18:24:46,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:46,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:24:49,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:49,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 18:24:51,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 18:24:51,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:24:59,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:25:00,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 18:25:00,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:25:00,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:25:03,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:25:05,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 18:25:05,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:08,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. 
Duration: 0.82 2023-09-28 18:25:08,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:09,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:11,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 18:25:20,316 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.08 vs. limit=15.0 2023-09-28 18:25:22,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:24,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=103426.66666666667, ans=0.0 2023-09-28 18:25:28,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:25:28,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:29,922 INFO [train.py:1039] (0/4) Epoch 3, batch 4900, loss[loss=0.2507, simple_loss=0.3119, pruned_loss=0.09474, over 24415.00 frames. ], tot_loss[loss=0.2821, simple_loss=0.3312, pruned_loss=0.1165, over 4706952.25 frames. ], batch size: 58, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:25:35,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 18:25:35,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 18:25:35,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=103493.33333333333, ans=0.125 2023-09-28 18:25:40,617 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.465e+02 2.992e+02 4.302e+02 8.236e+02, threshold=5.984e+02, percent-clipped=6.0 2023-09-28 18:25:40,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:42,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:42,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:25:45,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 18:25:49,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=103560.0, ans=0.0 2023-09-28 18:25:50,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 18:25:54,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 18:25:55,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 18:25:55,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 18:25:56,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=103560.0, ans=0.125 2023-09-28 18:25:57,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:57,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:25:57,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:57,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:25:57,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 18:26:00,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 18:26:01,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:26:03,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:26:04,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:26:06,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:26:08,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:26:08,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:08,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 18:26:10,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:26:13,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:26:13,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 18:26:13,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 18:26:18,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 18:26:18,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=103693.33333333333, ans=0.1 2023-09-28 18:26:20,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:26:21,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:26:23,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:26:23,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 18:26:23,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:26:23,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:26:24,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 18:26:27,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:30,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:26:30,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:26:31,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=103693.33333333333, ans=0.125 2023-09-28 18:26:34,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 18:26:35,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:26:35,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:26:35,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 18:26:36,720 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.27 vs. 
limit=22.5 2023-09-28 18:26:44,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:45,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:26:47,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 18:26:47,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:26:47,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:26:51,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:53,062 INFO [train.py:1039] (0/4) Epoch 3, batch 4950, loss[loss=0.2685, simple_loss=0.33, pruned_loss=0.1035, over 24484.00 frames. ], tot_loss[loss=0.2796, simple_loss=0.3288, pruned_loss=0.1152, over 4707661.70 frames. ], batch size: 63, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:26:54,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:26:54,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 18:26:54,949 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=103826.66666666667, ans=0.0 2023-09-28 18:26:55,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=103826.66666666667, ans=0.1 2023-09-28 18:26:56,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:56,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 18:26:56,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=103826.66666666667, ans=0.125 2023-09-28 18:26:56,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=103826.66666666667, ans=0.125 2023-09-28 18:26:57,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:26:58,560 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.80 vs. limit=6.0 2023-09-28 18:26:59,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=103826.66666666667, ans=0.125 2023-09-28 18:27:00,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:00,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-28 18:27:04,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 18:27:04,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 18:27:05,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:27:05,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 18:27:05,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:06,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:27:06,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:27:07,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:08,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:10,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:27:11,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:27:14,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:27:14,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:14,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:27:19,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:27:24,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:25,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:27:27,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:27,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:27,472 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=103960.0, ans=0.0 2023-09-28 18:27:28,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:27:30,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 18:27:31,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 18:27:34,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:27:37,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:27:37,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:27:39,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:27:39,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:27:40,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:27:42,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:44,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:27:45,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:27:50,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:50,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:52,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 18:27:52,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 18:27:53,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:27:56,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:27:59,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:27:59,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:28:01,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:01,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:28:02,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:28:04,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:28:04,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:28:04,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:28:05,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 18:28:09,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:28:13,299 INFO [train.py:1039] (0/4) Epoch 3, batch 5000, loss[loss=0.267, simple_loss=0.3336, pruned_loss=0.1002, over 24467.00 frames. ], tot_loss[loss=0.2783, simple_loss=0.3277, pruned_loss=0.1144, over 4704808.22 frames. ], batch size: 69, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:28:15,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 18:28:15,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:28:22,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:23,498 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.857e+02 2.486e+02 2.809e+02 3.764e+02 5.780e+02, threshold=5.617e+02, percent-clipped=0.0 2023-09-28 18:28:23,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:25,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 18:28:25,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 18:28:26,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:28:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 18:28:29,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 18:28:29,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:28:31,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 18:28:32,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:33,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:28:33,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 18:28:33,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:34,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:28:35,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 18:28:35,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 18:28:36,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:28:38,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 18:28:38,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 18:28:38,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:39,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:28:39,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 18:28:39,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 18:28:41,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 18:28:41,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:41,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:44,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 18:28:44,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:45,372 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.70 vs. limit=15.0 2023-09-28 18:28:45,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:46,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:28:48,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:28:50,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 18:28:50,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:28:52,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:28:57,398 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 18:29:00,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:29:00,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=104360.0, ans=0.0 2023-09-28 18:29:01,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:29:01,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:03,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 18:29:03,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:29:03,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 18:29:06,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:08,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 18:29:09,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:10,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=104360.0, ans=0.1 2023-09-28 18:29:11,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:13,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:18,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=104426.66666666667, ans=0.2 2023-09-28 18:29:19,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 18:29:23,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:33,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:34,922 INFO [train.py:1039] (0/4) Epoch 3, batch 5050, loss[loss=0.2478, simple_loss=0.3016, pruned_loss=0.09706, over 24499.00 frames. ], tot_loss[loss=0.2771, simple_loss=0.3272, pruned_loss=0.1135, over 4710395.53 frames. 
], batch size: 58, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:29:35,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:35,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:29:35,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:36,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:29:36,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:29:37,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 18:29:43,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:29:45,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:47,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 18:29:48,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 18:29:48,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:50,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:53,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:29:53,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:29:54,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:30:04,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 18:30:04,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:30:06,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:06,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 18:30:06,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:08,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 18:30:09,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:10,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:30:10,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 18:30:11,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 18:30:13,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:16,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:19,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:20,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 18:30:21,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:24,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 18:30:27,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:30:27,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 18:30:27,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:29,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:32,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:30:33,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:30:35,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:35,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:30:37,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:30:37,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 18:30:38,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:30:39,087 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:30:40,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:43,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:30:43,631 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 18:30:43,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:30:45,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:30:45,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:45,862 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 18:30:47,792 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=104760.0, ans=0.125 2023-09-28 18:30:48,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:48,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 18:30:48,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:52,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:52,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:54,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. 
Duration: 0.84 2023-09-28 18:30:54,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=104760.0, ans=0.0 2023-09-28 18:30:56,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 18:30:57,719 INFO [train.py:1039] (0/4) Epoch 3, batch 5100, loss[loss=0.2965, simple_loss=0.3369, pruned_loss=0.128, over 23552.00 frames. ], tot_loss[loss=0.2777, simple_loss=0.328, pruned_loss=0.1137, over 4712607.66 frames. ], batch size: 256, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:30:57,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:57,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:30:59,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:31:02,338 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 18:31:03,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:31:06,675 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.935e+02 2.697e+02 3.242e+02 4.082e+02 8.790e+02, threshold=6.484e+02, percent-clipped=7.0 2023-09-28 18:31:06,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 18:31:08,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. 
Duration: 0.73 2023-09-28 18:31:09,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:10,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=104826.66666666667, ans=0.125 2023-09-28 18:31:12,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:31:15,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:31:16,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 18:31:16,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 18:31:20,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:31:20,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=104893.33333333333, ans=0.0 2023-09-28 18:31:22,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:31:25,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:25,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=104893.33333333333, ans=0.125 2023-09-28 18:31:28,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-09-28 18:31:28,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:31,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:31:31,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:31:32,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 18:31:37,123 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 18:31:37,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:37,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 18:31:37,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 18:31:41,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:51,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:31:53,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 18:31:53,842 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 18:31:55,187 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 18:31:56,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 18:31:56,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:32:00,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 18:32:04,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=105093.33333333333, ans=0.125 2023-09-28 18:32:06,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 18:32:09,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:32:11,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:32:12,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 18:32:14,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. 
Duration: 0.57275 2023-09-28 18:32:14,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 18:32:18,909 INFO [train.py:1039] (0/4) Epoch 3, batch 5150, loss[loss=0.271, simple_loss=0.3191, pruned_loss=0.1114, over 23320.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.3294, pruned_loss=0.1145, over 4721969.51 frames. ], batch size: 119, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:32:22,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:32:22,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:32:22,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:32:24,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:32:24,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:32:24,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:32:25,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 18:32:25,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 18:32:27,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. 
Duration: 0.74 2023-09-28 18:32:27,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:32:27,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 18:32:29,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:29,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:32:30,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:32,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:37,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:32:37,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 18:32:40,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:40,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:32:42,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:32:42,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 18:32:42,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:32:43,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=105226.66666666667, ans=0.125 2023-09-28 18:32:44,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:32:44,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:32:44,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 18:32:46,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:32:46,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:32:49,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:32:51,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 18:32:53,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:33:00,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:33:04,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-09-28 18:33:07,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:14,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:14,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:18,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:19,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:21,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 18:33:25,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:33:25,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:33:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:33:30,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:30,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:32,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-09-28 18:33:34,223 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=105426.66666666667, ans=0.2 2023-09-28 18:33:37,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:39,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:33:42,274 INFO [train.py:1039] (0/4) Epoch 3, batch 5200, loss[loss=0.2794, simple_loss=0.3381, pruned_loss=0.1103, over 24570.00 frames. ], tot_loss[loss=0.2816, simple_loss=0.3317, pruned_loss=0.1158, over 4717958.19 frames. ], batch size: 71, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:33:42,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:42,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:33:43,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:33:43,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:33:43,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:33:44,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:33:47,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 18:33:49,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:33:52,551 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.485e+02 2.931e+02 3.472e+02 7.408e+02, threshold=5.863e+02, percent-clipped=1.0 2023-09-28 18:33:52,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:55,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 18:33:57,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:33:57,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:33:58,355 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.83 vs. limit=12.0 2023-09-28 18:33:58,538 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.51 vs. limit=22.5 2023-09-28 18:34:00,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:00,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:34:01,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:34:02,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 18:34:02,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=105560.0, ans=0.025 2023-09-28 18:34:07,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:34:09,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:12,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 18:34:13,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:34:13,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:34:15,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 18:34:15,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 18:34:19,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 18:34:21,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:21,692 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. 
Duration: 0.997 2023-09-28 18:34:21,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:21,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:23,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:34:23,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 18:34:23,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=105626.66666666667, ans=0.2 2023-09-28 18:34:24,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:34:27,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:27,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=105626.66666666667, ans=0.125 2023-09-28 18:34:30,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 18:34:30,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 18:34:30,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 18:34:35,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. 
Duration: 0.37 2023-09-28 18:34:36,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:34:41,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:34:43,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:34:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 18:34:44,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:44,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 18:34:44,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:44,856 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-09-28 18:34:45,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:34:48,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:34:50,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 18:34:54,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:55,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:34:55,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:59,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=105760.0, ans=0.1 2023-09-28 18:35:02,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:03,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 18:35:04,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:35:05,433 INFO [train.py:1039] (0/4) Epoch 3, batch 5250, loss[loss=0.2645, simple_loss=0.3282, pruned_loss=0.1004, over 24456.00 frames. ], tot_loss[loss=0.2808, simple_loss=0.3305, pruned_loss=0.1155, over 4711009.28 frames. ], batch size: 66, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:35:05,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:35:05,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:07,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 18:35:08,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:35:12,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:35:14,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:14,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:35:15,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:35:18,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=105826.66666666667, ans=10.0 2023-09-28 18:35:21,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:24,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:35:27,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:35:30,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:35:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. 
Duration: 0.95 2023-09-28 18:35:32,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:33,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:36,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.23 vs. limit=15.0 2023-09-28 18:35:58,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=106026.66666666667, ans=0.025 2023-09-28 18:36:06,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=106093.33333333333, ans=0.125 2023-09-28 18:36:20,358 INFO [train.py:1039] (0/4) Epoch 3, batch 5300, loss[loss=0.2448, simple_loss=0.3105, pruned_loss=0.08952, over 24477.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.3291, pruned_loss=0.1147, over 4699211.52 frames. ], batch size: 63, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:36:23,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=106160.0, ans=0.125 2023-09-28 18:36:28,665 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 2.515e+02 2.948e+02 3.617e+02 7.012e+02, threshold=5.895e+02, percent-clipped=2.0 2023-09-28 18:36:35,505 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-3.pt 2023-09-28 18:36:42,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:36:42,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-09-28 18:36:42,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 18:36:42,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:42,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:42,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:43,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:43,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:43,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:36:43,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:43,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:36:43,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:36:43,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 18:36:44,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-09-28 18:36:44,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 18:36:44,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:36:44,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 18:36:44,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 18:36:44,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:45,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:45,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:45,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:45,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:36:45,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:45,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:45,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 18:36:46,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:46,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:46,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:36:46,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:46,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:36:47,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 18:36:47,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:48,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:48,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 18:36:48,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 18:36:48,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:36:48,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:36:48,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 18:36:48,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 18:36:48,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:49,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:36:49,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:49,648 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 18:36:49,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 18:36:49,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:36:49,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:50,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 18:36:50,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 18:36:50,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-09-28 18:36:51,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:54,495 INFO [train.py:1039] (0/4) Epoch 4, batch 0, loss[loss=0.2936, simple_loss=0.3339, pruned_loss=0.1266, over 23740.00 frames. ], tot_loss[loss=0.2936, simple_loss=0.3339, pruned_loss=0.1266, over 23740.00 frames. ], batch size: 179, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:36:54,496 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 18:37:09,543 INFO [train.py:1071] (0/4) Epoch 4, validation: loss=0.3856, simple_loss=0.3373, pruned_loss=0.217, over 1125622.00 frames. 2023-09-28 18:37:09,544 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 18:37:12,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 18:37:14,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:37:15,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:37:21,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:21,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:37:22,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:24,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. 
Duration: 0.68 2023-09-28 18:37:25,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 18:37:27,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:27,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:34,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:37:34,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:36,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 18:37:39,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:40,361 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.51 vs. limit=12.0 2023-09-28 18:37:46,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:37:46,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:37:48,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 18:37:52,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:37:52,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:37:54,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:37:58,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:38:01,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:08,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 18:38:09,574 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=12.0 2023-09-28 18:38:13,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 18:38:13,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:13,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:15,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 18:38:15,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:38:16,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 18:38:20,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:22,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:25,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:38:29,341 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 18:38:30,524 INFO [train.py:1039] (0/4) Epoch 4, batch 50, loss[loss=0.2376, simple_loss=0.2971, pruned_loss=0.08903, over 24346.00 frames. ], tot_loss[loss=0.2807, simple_loss=0.3306, pruned_loss=0.1154, over 1065160.28 frames. ], batch size: 56, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:38:32,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:38:34,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:37,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:37,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-09-28 18:38:39,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:38:39,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:38:40,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:42,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:44,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:48,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 18:38:48,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:51,531 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-16000.pt 2023-09-28 18:38:58,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:39:00,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 18:39:01,025 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=106640.0, ans=0.2 2023-09-28 18:39:02,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-09-28 18:39:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:39:07,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:07,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:09,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:09,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:39:11,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:39:11,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:12,181 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.08 vs. limit=15.0 2023-09-28 18:39:16,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:19,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:19,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 18:39:19,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 18:39:20,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:39:22,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:39:22,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 18:39:23,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:24,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 18:39:30,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:39:30,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:30,967 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:39:33,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:35,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:35,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 18:39:37,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 18:39:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 18:39:37,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:39,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:41,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:41,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:43,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 18:39:43,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 18:39:44,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:39:46,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:46,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 18:39:47,465 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 2.007e+02 2.562e+02 2.907e+02 3.580e+02 6.238e+02, threshold=5.814e+02, percent-clipped=1.0 2023-09-28 18:39:47,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 18:39:47,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 18:39:49,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:49,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:50,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:39:50,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:39:55,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:39:55,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=106906.66666666667, ans=0.2 2023-09-28 18:39:56,652 INFO [train.py:1039] (0/4) Epoch 4, batch 100, loss[loss=0.246, simple_loss=0.3139, pruned_loss=0.08907, over 24476.00 frames. ], tot_loss[loss=0.2792, simple_loss=0.3298, pruned_loss=0.1143, over 1879417.11 frames. ], batch size: 66, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:39:58,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:40:01,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 18:40:05,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:40:08,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:40:08,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:08,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:40:08,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:40:10,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:11,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 18:40:12,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=106973.33333333333, ans=0.0 2023-09-28 18:40:15,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:40:15,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 18:40:15,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:17,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:21,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 18:40:21,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:22,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:22,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:40:24,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:40:29,207 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 18:40:29,231 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 18:40:30,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:40:30,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 18:40:32,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=107040.0, ans=0.0 2023-09-28 18:40:35,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:40:35,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:39,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:43,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=107040.0, ans=0.025 2023-09-28 18:40:46,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=107106.66666666667, ans=0.5 2023-09-28 18:40:47,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:47,621 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 18:40:49,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:40:54,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:40:56,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:00,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:41:02,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:05,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:06,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:41:08,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=107173.33333333333, ans=0.1 2023-09-28 18:41:09,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:10,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:11,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:11,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:41:11,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:13,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 18:41:13,064 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 18:41:13,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:41:14,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:41:14,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:14,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:14,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:41:16,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:41:16,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:41:16,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:16,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:18,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:18,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:41:18,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:41:19,795 INFO [train.py:1039] (0/4) Epoch 4, batch 150, loss[loss=0.285, simple_loss=0.3288, pruned_loss=0.1206, over 23807.00 frames. ], tot_loss[loss=0.2782, simple_loss=0.3297, pruned_loss=0.1133, over 2517144.82 frames. ], batch size: 164, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:41:21,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:24,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:24,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:41:24,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:28,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:28,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=107240.0, ans=0.125 2023-09-28 18:41:30,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:31,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:41:33,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:39,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. 
Duration: 0.65 2023-09-28 18:41:39,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 18:41:39,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 18:41:42,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:41:42,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:41:43,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:45,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:45,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:45,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:48,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:49,733 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 18:41:51,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:58,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:42:00,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=107373.33333333333, ans=0.0 2023-09-28 18:42:03,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:42:03,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 18:42:06,868 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=107440.0, ans=0.125 2023-09-28 18:42:09,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:42:09,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:09,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:11,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:42:13,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:42:16,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:42:17,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:18,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-09-28 18:42:22,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:24,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:24,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:42:24,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:42:27,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:28,022 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=15.0 2023-09-28 18:42:29,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 18:42:31,540 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.741e+02 2.491e+02 2.943e+02 3.333e+02 6.261e+02, threshold=5.886e+02, percent-clipped=1.0 2023-09-28 18:42:33,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:42:34,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:42:34,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 18:42:36,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:42:36,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 18:42:36,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:36,641 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 18:42:36,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=107506.66666666667, ans=0.125 2023-09-28 18:42:36,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=107506.66666666667, ans=0.0 2023-09-28 18:42:40,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:42:41,583 INFO [train.py:1039] (0/4) Epoch 4, batch 200, loss[loss=0.3078, simple_loss=0.3436, pruned_loss=0.136, over 23748.00 frames. ], tot_loss[loss=0.2757, simple_loss=0.3283, pruned_loss=0.1116, over 3017144.55 frames. ], batch size: 212, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:42:42,470 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.95 vs. limit=15.0 2023-09-28 18:42:44,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:42:44,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 18:42:48,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 18:42:49,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:50,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:51,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=107573.33333333333, ans=0.0 2023-09-28 18:42:53,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 18:42:56,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:42:56,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:57,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:58,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=107640.0, ans=0.1 2023-09-28 18:43:00,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:43:02,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:43:02,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:43:06,874 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.67 vs. limit=6.0 2023-09-28 18:43:10,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=107640.0, ans=0.1 2023-09-28 18:43:19,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:43:19,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:43:20,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=107706.66666666667, ans=0.125 2023-09-28 18:43:21,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:43:23,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:43:23,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 18:43:23,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:43:23,795 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:43:26,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:43:26,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:43:26,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:28,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:43:28,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 18:43:29,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:43:29,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:33,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:43:37,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=107773.33333333333, ans=0.125 2023-09-28 18:43:43,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:48,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=107840.0, ans=0.0 2023-09-28 18:43:51,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:43:53,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:43:57,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:01,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 18:44:01,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:01,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:44:01,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:01,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=107840.0, ans=0.0 2023-09-28 18:44:02,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:44:04,273 INFO [train.py:1039] (0/4) Epoch 4, batch 250, loss[loss=0.2826, simple_loss=0.347, pruned_loss=0.109, over 24051.00 frames. ], tot_loss[loss=0.2744, simple_loss=0.3271, pruned_loss=0.1108, over 3385820.55 frames. ], batch size: 80, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:44:04,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 18:44:04,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 18:44:04,565 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 18:44:07,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:11,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:44:13,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:13,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:14,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:44:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:16,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:44:20,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:44:34,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:36,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:38,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 18:44:41,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=108040.0, ans=0.125 2023-09-28 18:44:45,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:44:45,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:44:47,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:44:47,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:47,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:44:47,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:44:47,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:51,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:44:52,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 18:44:52,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:54,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 18:44:56,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:44:56,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:44:57,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:44:57,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:44:59,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:45:00,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:02,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:45:03,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:05,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=108106.66666666667, ans=0.125 2023-09-28 18:45:08,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:45:11,359 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.23 vs. 
limit=22.5 2023-09-28 18:45:12,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:13,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:45:17,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:19,088 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.821e+02 2.378e+02 2.704e+02 3.177e+02 4.711e+02, threshold=5.407e+02, percent-clipped=0.0 2023-09-28 18:45:19,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:45:23,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 18:45:25,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:45:25,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:45:25,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 18:45:25,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:45:26,858 INFO [train.py:1039] (0/4) Epoch 4, batch 300, loss[loss=0.2947, simple_loss=0.341, pruned_loss=0.1242, over 23535.00 frames. ], tot_loss[loss=0.2721, simple_loss=0.324, pruned_loss=0.1101, over 3674203.63 frames. 
], batch size: 106, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:45:27,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:45:27,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 18:45:32,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:32,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:45:38,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:45:38,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 18:45:40,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:42,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:45:42,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 18:45:42,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:45:44,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=108306.66666666667, ans=0.125 2023-09-28 18:45:47,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:45:52,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:45:52,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 18:45:57,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 18:45:58,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:00,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:01,155 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.70 vs. limit=6.0 2023-09-28 18:46:01,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:01,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 18:46:01,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 18:46:04,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:46:07,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:46:07,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:11,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:46:11,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 18:46:13,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:46:16,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:18,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 18:46:18,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:23,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:46:27,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:46:27,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-09-28 18:46:27,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=108440.0, ans=0.09899494936611666 2023-09-28 18:46:30,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:30,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:46:33,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:35,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:46:35,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 18:46:35,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:46:38,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:39,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 18:46:41,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:41,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:43,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:46:44,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:44,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:49,314 INFO [train.py:1039] (0/4) Epoch 4, batch 350, loss[loss=0.2857, simple_loss=0.3233, pruned_loss=0.124, over 23972.00 frames. ], tot_loss[loss=0.2701, simple_loss=0.3217, pruned_loss=0.1092, over 3905357.39 frames. ], batch size: 196, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:46:49,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:46:49,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:46:53,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:01,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:47:01,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=108573.33333333333, ans=0.0 2023-09-28 18:47:04,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:04,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:07,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. 
Duration: 0.97 2023-09-28 18:47:09,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:10,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 18:47:12,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:12,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 18:47:13,529 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.40 vs. limit=15.0 2023-09-28 18:47:14,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:17,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 18:47:18,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:47:21,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:22,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:47:24,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:47:24,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:24,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:25,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:47:26,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:47:26,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:34,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=108706.66666666667, ans=0.125 2023-09-28 18:47:35,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:47:36,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:47:36,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:47:36,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:41,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 18:47:41,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:47:46,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:47,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:47:47,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:49,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 18:47:50,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:47:52,360 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 18:47:53,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 18:47:53,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:57,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:57,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 18:48:00,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:48:03,916 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.316e+02 2.681e+02 3.192e+02 4.934e+02, threshold=5.363e+02, percent-clipped=0.0 2023-09-28 18:48:04,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:48:06,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:07,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:07,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:09,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:12,986 INFO [train.py:1039] (0/4) Epoch 4, batch 400, loss[loss=0.2456, simple_loss=0.3112, pruned_loss=0.08996, over 24422.00 frames. ], tot_loss[loss=0.2698, simple_loss=0.3219, pruned_loss=0.1089, over 4081569.38 frames. ], batch size: 69, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:48:13,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:48:16,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:48:16,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 18:48:17,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 18:48:17,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:20,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:48:21,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:22,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:25,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:28,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 18:48:30,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 18:48:30,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:32,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 18:48:32,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:35,060 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.94 vs. limit=22.5 2023-09-28 18:48:38,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 18:48:38,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:39,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 18:48:39,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:48:40,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:40,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:40,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:43,240 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 18:48:44,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 18:48:50,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:51,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:51,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 18:48:53,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-09-28 18:48:56,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:48:59,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:01,054 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=109106.66666666667, ans=0.2 2023-09-28 18:49:05,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 18:49:09,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:49:10,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 18:49:11,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=109106.66666666667, ans=0.1 2023-09-28 18:49:14,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:49:14,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:49:15,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 18:49:19,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:49:20,711 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.48 vs. 
limit=8.0 2023-09-28 18:49:23,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:49:23,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:49:26,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:26,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 18:49:26,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=109173.33333333333, ans=0.125 2023-09-28 18:49:28,486 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.94 vs. limit=22.5 2023-09-28 18:49:29,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:49:29,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=109173.33333333333, ans=0.125 2023-09-28 18:49:30,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 18:49:33,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:49:34,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 18:49:34,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 18:49:34,881 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=15.0 2023-09-28 18:49:35,756 INFO [train.py:1039] (0/4) Epoch 4, batch 450, loss[loss=0.2684, simple_loss=0.3351, pruned_loss=0.1009, over 24302.00 frames. ], tot_loss[loss=0.2713, simple_loss=0.3229, pruned_loss=0.1098, over 4219729.65 frames. ], batch size: 74, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:49:36,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:49:37,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:49:37,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:49:39,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 18:49:40,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:49:40,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:49:40,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 18:49:41,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=109240.0, ans=0.125 2023-09-28 18:49:41,195 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:49:43,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 18:49:43,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:49:44,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:49:46,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:49:55,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:57,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:49:59,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 18:49:59,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 18:50:03,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:50:06,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:50:08,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:11,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:12,228 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.59 vs. limit=15.0 2023-09-28 18:50:12,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:16,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 18:50:18,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 18:50:20,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 18:50:21,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:50:23,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:23,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:50:25,644 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 18:50:25,658 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-09-28 18:50:25,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:25,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=109440.0, ans=0.0 2023-09-28 18:50:27,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:50:27,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:50:27,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=109440.0, ans=0.2 2023-09-28 18:50:31,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:50:31,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:50:31,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 18:50:31,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=109440.0, ans=0.125 2023-09-28 18:50:32,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 18:50:35,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:37,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 18:50:37,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:50:38,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 18:50:43,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:50:45,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 18:50:45,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 18:50:46,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:51,107 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.294e+02 2.615e+02 3.130e+02 6.732e+02, threshold=5.230e+02, percent-clipped=1.0 2023-09-28 18:50:53,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:50:55,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:50:58,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:50:58,547 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 18:50:59,901 INFO [train.py:1039] (0/4) Epoch 4, batch 500, loss[loss=0.2541, simple_loss=0.3167, pruned_loss=0.0958, over 23245.00 frames. 
], tot_loss[loss=0.2733, simple_loss=0.3241, pruned_loss=0.1112, over 4301947.73 frames. ], batch size: 93, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:51:02,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:04,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:51:04,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:04,285 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 18:51:07,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 18:51:07,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:10,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:51:10,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=109573.33333333333, ans=0.1 2023-09-28 18:51:13,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=109573.33333333333, ans=0.2 2023-09-28 18:51:15,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:51:17,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 18:51:18,758 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.67 vs. limit=22.5 2023-09-28 18:51:19,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:51:19,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:19,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:20,123 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=109640.0, ans=0.125 2023-09-28 18:51:31,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:31,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:51:31,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:51:33,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:33,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 18:51:33,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 18:51:36,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:51:36,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:51:38,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:51:38,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:40,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 18:51:43,356 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 18:51:44,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:51:46,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=109706.66666666667, ans=0.1 2023-09-28 18:51:47,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:49,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 18:51:51,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 18:51:55,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:51:55,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:01,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:03,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:52:09,498 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.63 vs. limit=15.0 2023-09-28 18:52:10,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:12,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 18:52:14,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:14,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:17,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 18:52:19,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. 
Duration: 0.77775 2023-09-28 18:52:20,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:22,115 INFO [train.py:1039] (0/4) Epoch 4, batch 550, loss[loss=0.2831, simple_loss=0.3198, pruned_loss=0.1233, over 23648.00 frames. ], tot_loss[loss=0.2758, simple_loss=0.326, pruned_loss=0.1129, over 4392170.62 frames. ], batch size: 212, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:52:25,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 18:52:26,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 18:52:26,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:26,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 18:52:28,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:52:28,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:28,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:28,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:29,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 18:52:29,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:52:32,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:34,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 18:52:34,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:52:38,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:52:38,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:40,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:52:43,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:47,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 18:52:47,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=109973.33333333333, ans=0.125 2023-09-28 18:52:48,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 18:52:51,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 18:52:58,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:52:58,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:52:59,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:53:04,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:04,269 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 18:53:04,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:53:05,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 18:53:09,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=110106.66666666667, ans=0.125 2023-09-28 18:53:10,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:53:11,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:53:11,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:53:11,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:53:11,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=110106.66666666667, ans=0.2 2023-09-28 18:53:14,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 18:53:15,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 18:53:16,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:16,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:53:16,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:53:16,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:53:16,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=110106.66666666667, ans=0.0 2023-09-28 18:53:21,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:53:22,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:53:25,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:53:25,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 18:53:28,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:53:28,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:53:28,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=110173.33333333333, ans=0.125 2023-09-28 18:53:29,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:31,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:53:31,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:32,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:53:32,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:53:35,851 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.907e+02 2.501e+02 3.093e+02 3.785e+02 7.626e+02, threshold=6.186e+02, percent-clipped=7.0 2023-09-28 18:53:37,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 18:53:43,634 INFO [train.py:1039] (0/4) Epoch 4, batch 600, loss[loss=0.2956, simple_loss=0.3469, pruned_loss=0.1222, over 23418.00 frames. ], tot_loss[loss=0.2753, simple_loss=0.3258, pruned_loss=0.1124, over 4464708.47 frames. 
], batch size: 93, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:53:43,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 18:53:43,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:53:45,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:53:45,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:53,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:53:53,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=110240.0, ans=0.125 2023-09-28 18:53:55,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:53:57,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 18:53:58,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:54:01,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:04,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:06,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. 
Duration: 0.87 2023-09-28 18:54:06,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:54:12,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 18:54:18,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:54:18,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:18,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:54:18,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=110373.33333333333, ans=0.0 2023-09-28 18:54:25,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:54:25,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:54:26,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:33,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:54:38,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:38,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 18:54:38,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:39,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=110440.0, ans=0.1 2023-09-28 18:54:44,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 18:54:45,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=110440.0, ans=0.125 2023-09-28 18:54:50,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:54:50,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:54:54,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 18:54:56,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:55:00,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 18:55:00,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:55:00,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 18:55:01,036 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.61 vs. limit=22.5 2023-09-28 18:55:06,792 INFO [train.py:1039] (0/4) Epoch 4, batch 650, loss[loss=0.2729, simple_loss=0.3026, pruned_loss=0.1216, over 23637.00 frames. ], tot_loss[loss=0.2729, simple_loss=0.3238, pruned_loss=0.1111, over 4519923.67 frames. ], batch size: 256, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:55:06,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:55:07,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:55:09,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:12,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:55:15,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:17,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 18:55:17,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:55:23,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:55:23,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:55:29,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:29,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=110640.0, ans=0.1 2023-09-28 18:55:30,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 18:55:30,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=110640.0, ans=0.125 2023-09-28 18:55:34,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:55:34,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:38,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:55:38,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 18:55:38,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=110706.66666666667, ans=0.125 2023-09-28 18:55:41,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:42,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:42,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-28 18:55:43,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:44,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:55:46,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:55:46,248 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 18:55:46,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:46,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:55:49,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:50,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:52,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:55:52,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:55:54,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 18:55:54,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 18:55:55,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:57,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:55:57,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:57,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=110773.33333333333, ans=0.1 2023-09-28 18:55:59,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:56:02,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 18:56:03,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 18:56:05,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:05,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:56:05,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:56:05,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:56:08,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 18:56:12,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=110840.0, ans=0.125 2023-09-28 18:56:15,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:15,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:15,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:56:18,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:18,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 18:56:20,136 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.763e+02 2.387e+02 2.700e+02 3.231e+02 6.128e+02, threshold=5.400e+02, percent-clipped=0.0 2023-09-28 18:56:20,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:26,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:56:26,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:27,734 INFO [train.py:1039] (0/4) Epoch 4, batch 700, loss[loss=0.2888, simple_loss=0.3272, pruned_loss=0.1252, over 23843.00 frames. ], tot_loss[loss=0.2714, simple_loss=0.3225, pruned_loss=0.1102, over 4567317.02 frames. 
], batch size: 179, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:56:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:27,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:32,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 18:56:34,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 18:56:36,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 18:56:36,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:38,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=110906.66666666667, ans=0.125 2023-09-28 18:56:39,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:56:41,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 18:56:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:48,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:56:48,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 18:56:50,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:56:51,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:53,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:56,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:56:56,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:56:57,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=110973.33333333333, ans=0.2 2023-09-28 18:56:58,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 18:56:59,032 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-09-28 18:57:00,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 18:57:00,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=111040.0, ans=0.125 2023-09-28 18:57:05,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 18:57:05,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:57:08,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:57:11,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:57:13,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 18:57:15,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=111040.0, ans=0.0 2023-09-28 18:57:15,901 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.28 vs. limit=15.0 2023-09-28 18:57:18,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:19,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:57:19,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 18:57:24,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:57:26,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:30,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:57:36,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:57:36,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 18:57:36,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=111173.33333333333, ans=0.125 2023-09-28 18:57:40,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 18:57:40,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 18:57:43,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:45,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:57:46,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:57:50,923 INFO [train.py:1039] (0/4) Epoch 4, batch 750, loss[loss=0.301, simple_loss=0.3508, pruned_loss=0.1256, over 23521.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3215, pruned_loss=0.1095, over 4600939.51 frames. ], batch size: 93, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:57:51,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:51,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. 
Duration: 0.95 2023-09-28 18:57:54,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 18:57:54,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 18:57:56,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 18:57:57,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 18:57:57,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 18:57:57,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:57:59,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 18:57:59,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:58:01,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:02,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:04,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:05,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 18:58:05,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:06,036 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=111306.66666666667, ans=0.125 2023-09-28 18:58:07,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:58:08,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:58:10,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:58:12,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:14,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:14,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 18:58:15,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:58:17,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:18,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 18:58:20,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:58:21,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 18:58:21,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:58:23,203 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.73 vs. limit=12.0 2023-09-28 18:58:25,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 18:58:25,618 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 18:58:27,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 18:58:27,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:58:27,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:58:28,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:58:33,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=111373.33333333333, ans=0.125 2023-09-28 18:58:34,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 18:58:36,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:36,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:58:38,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:39,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:39,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 18:58:41,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:58:44,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:58:44,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:58:47,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:58:47,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 18:58:48,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:54,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:58:55,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:58:55,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:59,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:59:03,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 18:59:04,529 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.483e+02 2.790e+02 3.186e+02 5.320e+02, threshold=5.579e+02, percent-clipped=0.0 2023-09-28 18:59:04,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:04,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:12,121 INFO [train.py:1039] (0/4) Epoch 4, batch 800, loss[loss=0.2708, simple_loss=0.3352, pruned_loss=0.1032, over 24645.00 frames. ], tot_loss[loss=0.2702, simple_loss=0.3225, pruned_loss=0.109, over 4637678.58 frames. ], batch size: 68, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:59:12,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 18:59:12,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:59:20,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:20,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:23,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:23,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:23,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:24,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:24,924 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.41 vs. limit=15.0 2023-09-28 18:59:26,093 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.29 vs. limit=15.0 2023-09-28 18:59:26,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:31,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 18:59:31,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:59:36,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 18:59:37,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:38,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:38,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:59:39,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:39,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 18:59:39,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:39,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 18:59:39,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=111640.0, ans=0.125 2023-09-28 18:59:42,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:43,241 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.36 vs. 
limit=15.0 2023-09-28 18:59:45,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:46,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:46,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:50,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:50,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:56,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:59:56,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:59:56,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:59:58,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 18:59:58,738 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=111706.66666666667, ans=0.125 2023-09-28 18:59:59,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 18:59:59,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-28 18:59:59,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:01,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:03,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:08,250 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 19:00:08,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 19:00:11,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:00:13,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:00:18,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:00:21,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:23,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 19:00:23,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:00:26,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-09-28 19:00:34,263 INFO [train.py:1039] (0/4) Epoch 4, batch 850, loss[loss=0.272, simple_loss=0.3176, pruned_loss=0.1132, over 23594.00 frames. ], tot_loss[loss=0.2723, simple_loss=0.3243, pruned_loss=0.1102, over 4644511.35 frames. ], batch size: 149, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:00:34,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:36,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:00:36,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 19:00:36,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=111906.66666666667, ans=0.2 2023-09-28 19:00:37,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:00:37,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:39,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 19:00:40,095 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-09-28 19:00:40,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:40,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:00:42,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:44,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:00:44,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=111906.66666666667, ans=0.0 2023-09-28 19:00:44,996 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=111906.66666666667, ans=0.125 2023-09-28 19:00:46,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:48,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 19:00:48,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 19:00:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 19:00:49,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:49,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:00:50,603 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.26 vs. 
limit=15.0 2023-09-28 19:00:51,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=111973.33333333333, ans=0.1 2023-09-28 19:00:52,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:52,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:52,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:00:58,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:59,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:59,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 19:01:01,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 19:01:04,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:01:06,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 19:01:10,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 19:01:12,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. 
Duration: 0.8 2023-09-28 19:01:14,065 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 19:01:16,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:16,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:01:16,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:01:18,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 19:01:24,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:24,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:25,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:01:25,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 19:01:26,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=112106.66666666667, ans=0.125 2023-09-28 19:01:27,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=112106.66666666667, ans=0.125 2023-09-28 19:01:28,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:01:30,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:01:30,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 19:01:35,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:01:35,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:35,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:01:35,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:37,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:38,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 19:01:41,094 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=15.0 2023-09-28 19:01:42,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:01:43,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:01:45,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:01:46,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:01:47,643 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.96 vs. limit=15.0 2023-09-28 19:01:47,939 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.846e+02 2.332e+02 2.611e+02 3.097e+02 5.192e+02, threshold=5.223e+02, percent-clipped=0.0 2023-09-28 19:01:52,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:01:54,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:56,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 19:01:56,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 19:01:56,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:57,610 INFO [train.py:1039] (0/4) Epoch 4, batch 900, loss[loss=0.3963, simple_loss=0.3992, pruned_loss=0.1966, over 19906.00 frames. ], tot_loss[loss=0.2741, simple_loss=0.3259, pruned_loss=0.1112, over 4655020.64 frames. ], batch size: 388, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:01:59,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 19:02:01,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=112240.0, ans=0.125 2023-09-28 19:02:06,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:02:09,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:09,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 19:02:13,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:02:15,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 19:02:15,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:02:16,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 19:02:16,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:16,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:02:18,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:02:18,416 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=112306.66666666667, ans=0.0 2023-09-28 19:02:29,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:02:29,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:29,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:02:29,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=112373.33333333333, ans=0.0 2023-09-28 19:02:33,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:38,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 19:02:40,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:02:44,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. 
Duration: 0.536375 2023-09-28 19:02:44,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=112440.0, ans=0.2 2023-09-28 19:02:46,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:02:46,784 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 19:02:48,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 19:02:48,626 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=112440.0, ans=0.0 2023-09-28 19:02:54,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:02:54,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:02:55,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:03:01,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:01,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:05,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 19:03:05,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:03:08,315 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.81 vs. limit=22.5 2023-09-28 19:03:08,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 19:03:10,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:03:10,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:10,944 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.41 vs. limit=15.0 2023-09-28 19:03:13,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:03:13,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:16,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 19:03:16,635 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 19:03:18,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:03:18,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 19:03:20,103 INFO [train.py:1039] (0/4) Epoch 4, batch 950, loss[loss=0.2681, simple_loss=0.3316, pruned_loss=0.1023, over 24556.00 frames. 
], tot_loss[loss=0.2731, simple_loss=0.3253, pruned_loss=0.1104, over 4678651.94 frames. ], batch size: 71, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:03:21,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:26,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 19:03:28,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=112573.33333333333, ans=0.0 2023-09-28 19:03:29,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:33,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:33,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:34,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:03:36,474 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 19:03:41,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:41,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:03:41,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 19:03:43,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:03:43,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 19:03:44,050 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=112640.0, ans=0.1 2023-09-28 19:03:45,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:03:46,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:47,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 19:03:48,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:53,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:55,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 19:03:56,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 19:03:59,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:04:00,845 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.51 vs. limit=15.0 2023-09-28 19:04:01,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:04:05,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:06,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:04:08,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 19:04:10,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:04:10,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:04:12,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:12,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:12,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:04:17,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-09-28 19:04:18,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:04:20,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=112773.33333333333, ans=0.125 2023-09-28 19:04:21,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:23,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:23,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 19:04:24,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:24,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:04:25,053 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:04:26,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 19:04:29,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.96 vs. limit=12.0 2023-09-28 19:04:30,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:04:33,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:04:34,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.515e+02 2.858e+02 3.350e+02 4.786e+02, threshold=5.716e+02, percent-clipped=0.0 2023-09-28 19:04:35,989 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.77 vs. limit=22.5 2023-09-28 19:04:37,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:38,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 19:04:38,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 19:04:42,976 INFO [train.py:1039] (0/4) Epoch 4, batch 1000, loss[loss=0.2569, simple_loss=0.2818, pruned_loss=0.116, over 19347.00 frames. ], tot_loss[loss=0.2711, simple_loss=0.3234, pruned_loss=0.1094, over 4691577.19 frames. ], batch size: 388, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:04:43,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:46,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 19:04:46,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:51,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:04:53,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. 
Duration: 0.86 2023-09-28 19:04:53,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 19:04:59,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:04:59,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:59,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:04,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 19:05:08,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 19:05:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 19:05:10,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:11,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 19:05:13,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 19:05:14,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 19:05:16,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:05:17,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:21,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=113040.0, ans=0.2 2023-09-28 19:05:27,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:27,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:05:28,994 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=17.43 vs. limit=22.5 2023-09-28 19:05:29,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:30,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:30,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 19:05:30,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:31,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:05:32,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. 
Duration: 0.982 2023-09-28 19:05:36,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 19:05:37,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 19:05:39,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 19:05:40,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:05:47,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:47,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:05:47,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:05:50,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 19:05:52,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:05:53,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 19:05:53,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. 
Duration: 0.66 2023-09-28 19:05:55,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:05:55,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:59,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:06:02,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:06:02,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=113173.33333333333, ans=0.125 2023-09-28 19:06:04,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:06,319 INFO [train.py:1039] (0/4) Epoch 4, batch 1050, loss[loss=0.2851, simple_loss=0.3328, pruned_loss=0.1187, over 23584.00 frames. ], tot_loss[loss=0.27, simple_loss=0.3225, pruned_loss=0.1088, over 4690932.47 frames. ], batch size: 134, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:06:09,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:06:11,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:06:12,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:06:14,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 19:06:15,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:16,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:06:18,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:06:21,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:06:22,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:06:22,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:06:24,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:06:24,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 19:06:26,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:26,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 19:06:29,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:06:29,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-28 19:06:29,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:06:38,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:38,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:06:40,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:41,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 19:06:41,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 19:06:42,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:45,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 19:06:48,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 19:06:48,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:51,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:06:54,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-09-28 19:06:55,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:06:55,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:06:58,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:07:02,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 19:07:03,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 19:07:03,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 19:07:05,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:05,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:07:07,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 19:07:13,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:07:15,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:15,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 19:07:16,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:16,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 19:07:21,387 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.899e+02 2.368e+02 2.685e+02 3.530e+02 6.169e+02, threshold=5.370e+02, percent-clipped=1.0 2023-09-28 19:07:23,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:23,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 19:07:23,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 19:07:24,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:07:28,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:07:29,810 INFO [train.py:1039] (0/4) Epoch 4, batch 1100, loss[loss=0.2962, simple_loss=0.339, pruned_loss=0.1267, over 23881.00 frames. ], tot_loss[loss=0.2687, simple_loss=0.3204, pruned_loss=0.1085, over 4679216.89 frames. 
], batch size: 195, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:07:34,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:07:38,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:07:38,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:07:40,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 19:07:43,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:07:43,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=113573.33333333333, ans=0.0 2023-09-28 19:07:45,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:07:47,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:07:51,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:07:51,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. 
Duration: 0.75 2023-09-28 19:07:53,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:07:54,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:56,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:57,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:08:00,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:08:01,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=113706.66666666667, ans=0.1 2023-09-28 19:08:03,129 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.34 vs. limit=15.0 2023-09-28 19:08:04,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:08:06,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 19:08:08,005 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 19:08:08,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=113706.66666666667, ans=0.2 2023-09-28 19:08:09,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 19:08:11,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:12,250 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.03 vs. limit=22.5 2023-09-28 19:08:12,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:08:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:08:16,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 19:08:16,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:08:16,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:08:16,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:08:16,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:16,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 19:08:23,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:08:23,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. 
Duration: 0.91 2023-09-28 19:08:26,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:08:30,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:08:34,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 19:08:34,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:08:37,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:38,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=113840.0, ans=0.0 2023-09-28 19:08:40,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:08:40,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:42,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 19:08:43,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:08:43,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:44,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-09-28 19:08:44,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:08:45,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 19:08:49,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:08:49,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:08:51,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:08:52,502 INFO [train.py:1039] (0/4) Epoch 4, batch 1150, loss[loss=0.2841, simple_loss=0.3215, pruned_loss=0.1233, over 22739.00 frames. ], tot_loss[loss=0.2705, simple_loss=0.3221, pruned_loss=0.1094, over 4696766.83 frames. ], batch size: 322, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:08:54,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=113906.66666666667, ans=0.125 2023-09-28 19:08:57,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:08:59,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:09:01,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:01,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:09:02,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 19:09:02,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:05,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 19:09:05,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:05,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:09:05,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=113906.66666666667, ans=0.0 2023-09-28 19:09:09,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=113973.33333333333, ans=0.2 2023-09-28 19:09:12,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 19:09:14,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:15,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=113973.33333333333, ans=0.125 2023-09-28 19:09:20,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:20,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:09:20,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 19:09:20,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:09:20,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:24,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 19:09:26,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:27,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:37,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=114040.0, ans=0.0 2023-09-28 19:09:38,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 19:09:47,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:48,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:09:53,369 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 19:09:56,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:03,240 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 19:10:06,254 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.334e+02 2.773e+02 3.498e+02 6.141e+02, threshold=5.547e+02, percent-clipped=2.0 2023-09-28 19:10:06,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:08,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:10:08,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:10:08,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:10:11,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:14,773 INFO [train.py:1039] (0/4) Epoch 4, batch 1200, loss[loss=0.2901, simple_loss=0.3307, pruned_loss=0.1248, over 23428.00 frames. ], tot_loss[loss=0.2707, simple_loss=0.3227, pruned_loss=0.1094, over 4704477.52 frames. ], batch size: 285, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:10:15,447 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.09 vs. 
limit=15.0 2023-09-28 19:10:17,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:10:17,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:10:19,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:19,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:19,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:10:19,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=114240.0, ans=0.1 2023-09-28 19:10:23,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:10:24,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:10:26,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:26,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:26,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=114240.0, ans=0.1 2023-09-28 19:10:29,340 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. 
Duration: 0.896 2023-09-28 19:10:32,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 19:10:37,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:10:37,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=114306.66666666667, ans=0.125 2023-09-28 19:10:37,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=114306.66666666667, ans=0.0 2023-09-28 19:10:39,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:10:42,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:44,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:10:44,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 19:10:44,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:53,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:10:53,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:10:53,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. 
Duration: 0.94 2023-09-28 19:10:55,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:10:59,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 19:11:02,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 19:11:02,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:11:05,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:11:06,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:06,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:11:08,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:11:08,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:11:08,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:11:10,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 19:11:10,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 19:11:11,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=114440.0, ans=0.0 2023-09-28 19:11:12,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:12,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:11:15,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:15,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:20,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:11:22,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:11:23,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 19:11:27,134 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 19:11:29,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:11:32,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:34,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:11:35,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:37,078 INFO [train.py:1039] (0/4) Epoch 4, batch 1250, loss[loss=0.2605, simple_loss=0.3094, pruned_loss=0.1058, over 20329.00 frames. ], tot_loss[loss=0.2713, simple_loss=0.3234, pruned_loss=0.1095, over 4706574.90 frames. ], batch size: 44, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:11:38,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 19:11:42,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:11:44,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:44,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 19:11:46,799 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.98 vs. limit=22.5 2023-09-28 19:11:48,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:11:49,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:11:50,592 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.97 vs. limit=15.0 2023-09-28 19:11:54,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 19:11:54,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:55,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:11:55,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:11:59,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:12:04,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:12:04,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:12:04,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:04,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=114640.0, ans=0.0 2023-09-28 19:12:06,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:12:06,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:09,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:10,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 19:12:11,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=114706.66666666667, ans=0.2 2023-09-28 19:12:13,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 19:12:15,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:12:18,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=114706.66666666667, ans=0.1 2023-09-28 19:12:19,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:19,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 19:12:21,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:12:21,438 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 19:12:21,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:21,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:24,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:27,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 19:12:29,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:12:30,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 19:12:30,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 19:12:32,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 19:12:34,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:12:34,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=114773.33333333333, ans=0.2 2023-09-28 19:12:36,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 19:12:37,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:40,121 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=12.0 2023-09-28 19:12:40,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:12:40,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:12:42,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. 
Duration: 0.98 2023-09-28 19:12:42,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:12:43,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:12:43,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:12:45,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:47,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=114840.0, ans=0.125 2023-09-28 19:12:48,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 19:12:49,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:51,346 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.852e+02 2.462e+02 2.704e+02 3.277e+02 4.911e+02, threshold=5.408e+02, percent-clipped=0.0 2023-09-28 19:12:51,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:12:53,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:12:57,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-09-28 19:13:00,820 INFO [train.py:1039] (0/4) Epoch 4, batch 1300, loss[loss=0.2426, simple_loss=0.3024, pruned_loss=0.09144, over 24327.00 frames. ], tot_loss[loss=0.2716, simple_loss=0.3242, pruned_loss=0.1095, over 4707808.39 frames. ], batch size: 61, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:13:01,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:13:02,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 19:13:04,437 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:13:05,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:07,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:13:08,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:12,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:13:13,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:13:13,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 19:13:20,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-28 19:13:20,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:13:20,817 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-09-28 19:13:21,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 19:13:26,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:13:30,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:30,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:32,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:34,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:35,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:13:37,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:13:37,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-09-28 19:13:39,941 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.10 vs. limit=22.5 2023-09-28 19:13:43,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:13:43,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:13:45,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 19:13:47,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:13:48,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:13:51,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:53,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 19:13:53,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:53,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 19:13:53,990 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-09-28 19:13:54,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:13:55,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=115106.66666666667, ans=0.125 2023-09-28 19:13:59,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:59,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:14:04,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 19:14:04,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 19:14:06,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 19:14:11,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:14:13,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 19:14:14,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:22,651 INFO [train.py:1039] (0/4) Epoch 4, batch 1350, loss[loss=0.2798, simple_loss=0.2988, pruned_loss=0.1304, over 19231.00 frames. ], tot_loss[loss=0.2702, simple_loss=0.3219, pruned_loss=0.1092, over 4704116.86 frames. ], batch size: 388, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:14:24,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. 
Duration: 0.91 2023-09-28 19:14:28,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:30,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:33,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:33,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:34,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=115240.0, ans=0.1 2023-09-28 19:14:35,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:14:35,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:43,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:44,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 19:14:44,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:14:46,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:14:48,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. 
Duration: 0.39 2023-09-28 19:14:49,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:14:51,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:14:51,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 19:14:52,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 19:14:54,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 19:14:54,733 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:14:56,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:56,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 19:14:57,000 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.87 vs. limit=15.0 2023-09-28 19:15:07,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:12,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=115440.0, ans=0.125 2023-09-28 19:15:16,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 19:15:18,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:18,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 19:15:20,306 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=115440.0, ans=0.125 2023-09-28 19:15:21,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:21,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 19:15:21,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:15:23,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:15:26,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:15:29,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 19:15:31,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:15:36,830 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.262e+02 2.530e+02 3.010e+02 4.866e+02, threshold=5.060e+02, percent-clipped=0.0 2023-09-28 19:15:38,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. 
Duration: 0.69 2023-09-28 19:15:40,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 19:15:45,548 INFO [train.py:1039] (0/4) Epoch 4, batch 1400, loss[loss=0.2451, simple_loss=0.313, pruned_loss=0.0886, over 24424.00 frames. ], tot_loss[loss=0.2673, simple_loss=0.3197, pruned_loss=0.1074, over 4698096.07 frames. ], batch size: 63, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:15:45,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 19:15:48,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:52,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:15:52,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:15:57,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 19:16:00,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 19:16:08,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:16:09,683 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=15.0 2023-09-28 19:16:10,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:16:10,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=115640.0, ans=0.1 2023-09-28 19:16:13,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:16:13,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:16:15,460 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:16:17,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:16:17,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=115706.66666666667, ans=0.05 2023-09-28 19:16:20,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:16:28,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:29,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=115706.66666666667, ans=0.125 2023-09-28 19:16:30,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:35,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 19:16:37,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 19:16:37,425 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=115773.33333333333, ans=0.0 2023-09-28 19:16:38,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:16:38,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:16:38,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:39,115 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=115773.33333333333, ans=0.125 2023-09-28 19:16:40,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:16:40,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:16:40,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:16:43,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 19:16:43,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:16:49,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:52,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 19:16:53,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=115840.0, ans=0.125 2023-09-28 19:16:53,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.93 vs. limit=15.0 2023-09-28 19:17:00,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 19:17:01,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:17:03,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:17:06,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 19:17:06,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:07,665 INFO [train.py:1039] (0/4) Epoch 4, batch 1450, loss[loss=0.2645, simple_loss=0.3246, pruned_loss=0.1021, over 24488.00 frames. ], tot_loss[loss=0.2666, simple_loss=0.3199, pruned_loss=0.1067, over 4707544.11 frames. ], batch size: 66, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:17:07,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:17:12,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:17:15,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:17:15,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:15,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:17:20,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:20,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:17:22,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:17:22,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 19:17:22,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:17:23,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 19:17:24,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:24,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=115973.33333333333, ans=0.1 2023-09-28 19:17:25,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:25,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. 
Duration: 0.83 2023-09-28 19:17:27,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:29,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:17:29,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 19:17:31,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:31,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:17:32,576 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=15.0 2023-09-28 19:17:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:34,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:38,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:17:38,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:17:39,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:17:41,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:42,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:43,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:17:43,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:45,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:17:49,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 19:17:51,961 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.05 vs. limit=10.0 2023-09-28 19:17:52,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:54,339 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 19:17:55,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:17:59,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:17:59,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:18:00,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 19:18:06,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:06,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 19:18:08,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 19:18:09,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:12,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:14,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:18:15,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 19:18:17,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 19:18:17,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 19:18:17,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=116173.33333333333, ans=0.125 2023-09-28 19:18:20,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 19:18:22,089 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.906e+02 2.300e+02 2.689e+02 3.268e+02 5.170e+02, threshold=5.379e+02, percent-clipped=2.0 2023-09-28 19:18:22,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:18:25,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.43 vs. limit=15.0 2023-09-28 19:18:29,004 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-09-28 19:18:29,427 INFO [train.py:1039] (0/4) Epoch 4, batch 1500, loss[loss=0.2255, simple_loss=0.2884, pruned_loss=0.08129, over 24432.00 frames. ], tot_loss[loss=0.2687, simple_loss=0.3211, pruned_loss=0.1081, over 4698113.19 frames. ], batch size: 58, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:18:34,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 19:18:34,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:18:34,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:18:36,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:38,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:18:38,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:18:40,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 19:18:41,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:18:42,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:18:42,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:43,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:44,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:18:47,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 19:18:54,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:18:54,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 19:18:56,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:57,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 19:19:02,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 19:19:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:19:04,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 19:19:05,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:19:09,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:09,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:19:09,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:19:11,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 19:19:11,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 19:19:11,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=116373.33333333333, ans=0.125 2023-09-28 19:19:13,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:13,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 19:19:13,628 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:19:15,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:15,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=116373.33333333333, ans=0.0 2023-09-28 19:19:21,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:19:21,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 19:19:28,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:19:29,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:19:34,035 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 19:19:35,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 19:19:35,500 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 19:19:37,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:19:37,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:19:38,762 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 19:19:40,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:19:41,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 19:19:44,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:47,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:48,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:49,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:49,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:51,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 19:19:51,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 19:19:52,780 INFO [train.py:1039] (0/4) Epoch 4, batch 1550, loss[loss=0.2391, simple_loss=0.3038, pruned_loss=0.08717, over 24471.00 frames. ], tot_loss[loss=0.2684, simple_loss=0.3212, pruned_loss=0.1078, over 4706616.64 frames. ], batch size: 63, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:19:52,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 19:19:52,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:19:53,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 19:19:54,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 19:19:54,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=116573.33333333333, ans=0.2 2023-09-28 19:19:58,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:00,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:00,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:00,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 19:20:02,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:03,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:07,075 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 19:20:07,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:07,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:20:07,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:20:10,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:20:10,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 19:20:11,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:11,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 19:20:13,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 19:20:13,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. 
Duration: 0.98 2023-09-28 19:20:13,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:15,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:20,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:20:22,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 19:20:22,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 19:20:24,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=116706.66666666667, ans=0.2 2023-09-28 19:20:31,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=116706.66666666667, ans=0.0 2023-09-28 19:20:32,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:36,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:36,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:20:36,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:20:36,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. 
Duration: 0.41 2023-09-28 19:20:36,737 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:20:41,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:20:42,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:45,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:20:48,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:20:48,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:48,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 19:20:50,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:20:51,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:20:52,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:53,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:20:53,993 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-09-28 19:20:56,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:01,355 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-28 19:21:02,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 19:21:06,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:07,903 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.992e+02 2.334e+02 2.588e+02 3.124e+02 7.530e+02, threshold=5.176e+02, percent-clipped=1.0 2023-09-28 19:21:08,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:21:09,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 19:21:11,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:21:11,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=116840.0, ans=0.1 2023-09-28 19:21:12,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:12,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 19:21:12,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:21:14,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:21:15,412 INFO [train.py:1039] (0/4) Epoch 4, batch 1600, loss[loss=0.2726, simple_loss=0.3295, pruned_loss=0.1079, over 23230.00 frames. ], tot_loss[loss=0.2684, simple_loss=0.3218, pruned_loss=0.1075, over 4698226.61 frames. ], batch size: 105, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:21:18,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:18,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 19:21:20,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 19:21:22,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 19:21:24,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:21:26,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 19:21:28,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:21:30,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:21:34,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:21:40,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 19:21:43,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:21:43,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 19:21:43,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:45,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 19:21:48,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=117040.0, ans=0.125 2023-09-28 19:21:49,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 19:21:57,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:01,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 19:22:02,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:02,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 19:22:02,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:22:05,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 19:22:09,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:22:11,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:22:12,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:12,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:12,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:22:16,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:22:18,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:22:19,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:22:24,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:25,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 19:22:29,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 19:22:29,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:22:29,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 19:22:34,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:37,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:22:37,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:22:39,568 INFO [train.py:1039] (0/4) Epoch 4, batch 1650, loss[loss=0.2889, simple_loss=0.3357, pruned_loss=0.1211, over 23123.00 frames. ], tot_loss[loss=0.2677, simple_loss=0.3218, pruned_loss=0.1069, over 4713743.75 frames. ], batch size: 105, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:22:39,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 19:22:39,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 19:22:39,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 19:22:39,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. 
Duration: 0.59 2023-09-28 19:22:44,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:45,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:45,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:22:47,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:22:49,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=117240.0, ans=0.125 2023-09-28 19:22:50,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:53,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 19:22:54,179 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=117306.66666666667, ans=10.0 2023-09-28 19:22:56,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:56,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=117306.66666666667, ans=0.0 2023-09-28 19:22:57,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:57,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 19:22:57,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:22:57,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 19:22:57,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 19:22:57,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=117306.66666666667, ans=0.125 2023-09-28 19:23:03,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=117306.66666666667, ans=0.125 2023-09-28 19:23:04,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:23:07,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:23:07,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=117306.66666666667, ans=0.2 2023-09-28 19:23:18,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 19:23:19,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:20,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-09-28 19:23:22,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=117373.33333333333, ans=0.125 2023-09-28 19:23:23,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:26,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:23:26,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:23:26,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:26,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=117373.33333333333, ans=0.125 2023-09-28 19:23:27,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:23:29,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:31,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:23:31,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:32,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 19:23:32,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:32,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:33,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:23:36,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:36,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=117440.0, ans=0.0 2023-09-28 19:23:37,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 19:23:39,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:39,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 19:23:42,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 19:23:42,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 19:23:42,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:43,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 19:23:45,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:45,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:45,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 19:23:50,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:53,773 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.422e+02 2.762e+02 3.244e+02 4.441e+02, threshold=5.524e+02, percent-clipped=0.0 2023-09-28 19:23:53,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:23:53,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:55,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 19:24:00,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:24:00,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:24:01,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 19:24:01,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 19:24:01,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:24:01,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:02,348 INFO [train.py:1039] (0/4) Epoch 4, batch 1700, loss[loss=0.2749, simple_loss=0.3427, pruned_loss=0.1035, over 24362.00 frames. ], tot_loss[loss=0.2679, simple_loss=0.3218, pruned_loss=0.107, over 4717900.76 frames. ], batch size: 74, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:24:05,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:24:06,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:24:06,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 19:24:10,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:24:20,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:21,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:24:27,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:24:28,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 19:24:30,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:30,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:24:31,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 19:24:34,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:24:34,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:38,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:24:39,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:24:41,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 19:24:43,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 19:24:43,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:45,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 19:24:46,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 19:24:56,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:24:58,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:24:59,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:25:00,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:25:00,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 19:25:01,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:25:03,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:03,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 19:25:04,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:04,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:04,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:04,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:25:06,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=117840.0, ans=0.04949747468305833 2023-09-28 19:25:07,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:07,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:25:09,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:09,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:25:09,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:13,740 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=15.14 vs. limit=10.0 2023-09-28 19:25:14,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:16,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 19:25:18,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:18,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:25:19,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 19:25:24,970 INFO [train.py:1039] (0/4) Epoch 4, batch 1750, loss[loss=0.262, simple_loss=0.3338, pruned_loss=0.0951, over 24424.00 frames. ], tot_loss[loss=0.2671, simple_loss=0.3205, pruned_loss=0.1068, over 4705633.89 frames. ], batch size: 69, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:25:26,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:28,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:28,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:25:30,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 19:25:31,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:34,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:25:34,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:39,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 19:25:40,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:25:44,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 19:25:44,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:46,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:25:48,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:25:51,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 19:25:51,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:53,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 19:26:03,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:26:06,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:06,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:12,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:12,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:26:13,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=118106.66666666667, ans=0.125 2023-09-28 19:26:14,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:26:16,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:18,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:19,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:26:19,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 19:26:21,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:24,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 19:26:24,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:26,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:27,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:26:32,243 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.48 vs. 
limit=15.0 2023-09-28 19:26:32,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:26:32,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:26:32,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:35,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:39,664 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.369e+02 2.693e+02 3.225e+02 5.418e+02, threshold=5.386e+02, percent-clipped=0.0 2023-09-28 19:26:40,166 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:26:41,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:44,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:26:44,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:26:45,313 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.76 vs. limit=6.0 2023-09-28 19:26:47,450 INFO [train.py:1039] (0/4) Epoch 4, batch 1800, loss[loss=0.2773, simple_loss=0.3249, pruned_loss=0.1148, over 23550.00 frames. ], tot_loss[loss=0.2664, simple_loss=0.3201, pruned_loss=0.1063, over 4713017.97 frames. 
], batch size: 256, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:26:47,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 19:26:47,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:48,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:26:48,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:26:49,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:26:49,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:26:50,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:26:52,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:26:54,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:55,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:26:58,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 19:27:01,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:27:04,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:27:07,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:08,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=118306.66666666667, ans=0.1 2023-09-28 19:27:10,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:10,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:12,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:27:14,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:27:14,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 19:27:15,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:18,784 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.43 vs. 
limit=15.0 2023-09-28 19:27:19,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:21,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 19:27:24,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 19:27:24,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 19:27:24,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:26,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:26,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:27:27,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:27:34,685 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 19:27:34,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:27:36,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:38,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. 
Duration: 0.99 2023-09-28 19:27:38,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 19:27:40,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:27:41,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:27:41,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:27:42,675 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.73 vs. limit=15.0 2023-09-28 19:27:46,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 19:27:51,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:27:51,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 19:27:51,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:53,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:53,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:27:53,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. 
Duration: 0.44 2023-09-28 19:27:53,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=118506.66666666667, ans=0.0 2023-09-28 19:27:58,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:27:58,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:01,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 19:28:01,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:03,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:04,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:28:04,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:07,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:07,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:28:10,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:28:10,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:28:11,775 INFO [train.py:1039] (0/4) Epoch 4, batch 1850, loss[loss=0.2543, simple_loss=0.3182, pruned_loss=0.09523, over 24538.00 frames. ], tot_loss[loss=0.2673, simple_loss=0.3211, pruned_loss=0.1067, over 4717713.50 frames. ], batch size: 71, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:28:13,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:28:13,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:28:23,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:28:23,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 19:28:26,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 19:28:29,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 19:28:34,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:35,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 19:28:35,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 19:28:44,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 19:28:46,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 19:28:49,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:28:50,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:28:53,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=118706.66666666667, ans=0.125 2023-09-28 19:28:54,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 19:28:54,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:54,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:28:57,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:28:59,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:29:01,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:04,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:29:04,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:29:05,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:29:05,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:05,986 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=118773.33333333333, ans=0.125 2023-09-28 19:29:09,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:09,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:29:14,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 19:29:14,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:16,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=118773.33333333333, ans=0.04949747468305833 2023-09-28 19:29:19,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:29:19,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:29:19,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. 
Duration: 0.8 2023-09-28 19:29:19,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 19:29:21,618 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 19:29:21,739 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 19:29:24,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:29:24,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:29:24,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:24,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:24,797 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 19:29:24,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:29:26,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:27,990 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.551e+02 2.974e+02 3.413e+02 5.793e+02, threshold=5.947e+02, percent-clipped=2.0 2023-09-28 19:29:28,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 19:29:28,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:29:29,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:29:29,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 19:29:33,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:33,091 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 19:29:33,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:29:34,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:36,064 INFO [train.py:1039] (0/4) Epoch 4, batch 1900, loss[loss=0.2791, simple_loss=0.3396, pruned_loss=0.1093, over 24403.00 frames. ], tot_loss[loss=0.2675, simple_loss=0.3215, pruned_loss=0.1068, over 4717267.69 frames. ], batch size: 77, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:29:38,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=118906.66666666667, ans=0.2 2023-09-28 19:29:39,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:40,097 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.58 vs. 
limit=22.5 2023-09-28 19:29:41,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:29:43,586 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 19:29:43,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 19:29:45,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:46,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:46,625 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 19:29:46,691 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 19:29:52,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 19:29:54,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:29:54,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=118973.33333333333, ans=0.125 2023-09-28 19:29:57,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 19:30:00,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. 
Duration: 0.81 2023-09-28 19:30:04,251 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=118973.33333333333, ans=0.04949747468305833 2023-09-28 19:30:08,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 19:30:08,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=119040.0, ans=0.125 2023-09-28 19:30:11,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 19:30:12,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:30:12,999 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 19:30:13,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 19:30:14,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 19:30:14,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 19:30:14,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:30:14,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=119040.0, ans=0.0 2023-09-28 19:30:20,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. 
Duration: 0.97 2023-09-28 19:30:23,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:30:27,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:27,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 19:30:30,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:30:32,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 19:30:33,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:42,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:30:42,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:30:42,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:30:42,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:30:44,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:30:45,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-09-28 19:30:45,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:30:48,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:48,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:30:52,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:30:52,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:54,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:54,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:58,746 INFO [train.py:1039] (0/4) Epoch 4, batch 1950, loss[loss=0.2573, simple_loss=0.3048, pruned_loss=0.1049, over 24431.00 frames. ], tot_loss[loss=0.2695, simple_loss=0.3226, pruned_loss=0.1082, over 4714225.55 frames. ], batch size: 58, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:30:58,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:30:59,542 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-09-28 19:31:01,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 19:31:03,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:03,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:31:05,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 19:31:06,556 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.31 vs. limit=15.0 2023-09-28 19:31:06,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:31:07,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:10,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:13,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:31:13,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:13,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:17,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:18,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 19:31:18,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:31:20,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:31:20,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:22,100 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=119306.66666666667, ans=0.025 2023-09-28 19:31:24,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:27,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:31:27,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:27,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:31:27,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 19:31:29,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:31:29,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 19:31:29,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:36,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:36,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=119373.33333333333, ans=0.125 2023-09-28 19:31:37,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:31:41,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=119373.33333333333, ans=0.0 2023-09-28 19:31:43,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:31:46,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:31:46,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:31:46,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 19:31:46,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:31:52,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:31:52,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:31:52,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:00,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:01,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:06,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:08,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:09,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:32:11,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:11,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 19:32:11,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 19:32:12,782 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.624e+02 2.939e+02 3.496e+02 6.198e+02, threshold=5.878e+02, percent-clipped=1.0 2023-09-28 19:32:12,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:32:14,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 19:32:17,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:21,592 INFO [train.py:1039] (0/4) Epoch 4, batch 2000, loss[loss=0.2276, simple_loss=0.2982, pruned_loss=0.07854, over 24479.00 frames. ], tot_loss[loss=0.2697, simple_loss=0.3232, pruned_loss=0.1081, over 4711002.22 frames. ], batch size: 66, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:32:21,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:22,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=119573.33333333333, ans=0.2 2023-09-28 19:32:23,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:32:23,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:32:26,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:32:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:32:31,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 19:32:31,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:32:33,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=119573.33333333333, ans=0.0 2023-09-28 19:32:34,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:32:34,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=119573.33333333333, ans=0.125 2023-09-28 19:32:36,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 19:32:36,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:32:36,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:39,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:32:41,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 19:32:43,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:32:44,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:47,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 19:32:47,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:32:49,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 19:32:49,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:32:50,174 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=119640.0, ans=0.0 2023-09-28 19:32:53,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:32:53,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:32:53,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:55,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:32:55,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:32:56,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. 
Duration: 0.94 2023-09-28 19:33:00,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 19:33:00,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:33:00,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:00,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=119706.66666666667, ans=0.1 2023-09-28 19:33:06,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:06,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=119706.66666666667, ans=10.0 2023-09-28 19:33:07,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:33:08,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:09,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:33:11,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:11,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:33:13,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:13,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:16,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:19,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:33:19,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 19:33:24,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:33:27,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:32,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:32,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:33:35,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:39,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:39,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:33:39,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:33:39,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:33:42,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:44,038 INFO [train.py:1039] (0/4) Epoch 4, batch 2050, loss[loss=0.2907, simple_loss=0.3522, pruned_loss=0.1146, over 24465.00 frames. ], tot_loss[loss=0.2687, simple_loss=0.3222, pruned_loss=0.1076, over 4718952.77 frames. ], batch size: 69, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:33:44,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:47,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:47,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=119906.66666666667, ans=0.125 2023-09-28 19:33:48,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:52,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=119906.66666666667, ans=0.125 2023-09-28 19:33:55,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:57,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 19:33:57,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:58,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:02,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 19:34:02,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:34:02,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:04,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:34:09,066 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.60 vs. limit=12.0 2023-09-28 19:34:14,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:14,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:17,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 19:34:19,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:20,325 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.83 vs. 
limit=15.0 2023-09-28 19:34:20,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 19:34:21,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:24,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:27,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:29,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:34:29,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:30,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:34:31,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:34:31,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:34:32,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=120106.66666666667, ans=0.125 2023-09-28 19:34:36,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:37,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 19:34:39,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:34:39,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:44,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:34:47,002 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.07 vs. limit=15.0 2023-09-28 19:34:49,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:50,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 19:34:53,157 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.38 vs. limit=15.0 2023-09-28 19:34:55,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:34:57,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:34:59,046 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.560e+02 3.020e+02 3.672e+02 5.923e+02, threshold=6.041e+02, percent-clipped=1.0 2023-09-28 19:35:00,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 19:35:02,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 19:35:05,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=120240.0, ans=0.125 2023-09-28 19:35:06,688 INFO [train.py:1039] (0/4) Epoch 4, batch 2100, loss[loss=0.2538, simple_loss=0.3223, pruned_loss=0.0927, over 24453.00 frames. ], tot_loss[loss=0.267, simple_loss=0.3202, pruned_loss=0.1069, over 4708842.62 frames. ], batch size: 69, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:35:06,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 19:35:06,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:06,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:08,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:09,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:35:09,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 19:35:10,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 19:35:12,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 19:35:15,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:35:17,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:35:20,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:22,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:35:22,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 19:35:22,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:35:23,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 19:35:23,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 19:35:25,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:26,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:35:26,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 19:35:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-28 19:35:32,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 19:35:32,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:37,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:35:37,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:40,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:35:40,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 19:35:41,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:41,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:35:43,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 19:35:43,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:43,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 19:35:45,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. 
Duration: 0.96 2023-09-28 19:35:45,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 19:35:48,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:35:50,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:35:53,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:54,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:57,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:59,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:59,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 19:35:59,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:59,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:00,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:00,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. 
Duration: 0.76 2023-09-28 19:36:02,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 19:36:02,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 19:36:02,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=120440.0, ans=0.1 2023-09-28 19:36:06,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:36:09,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:36:09,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=120440.0, ans=0.125 2023-09-28 19:36:10,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 19:36:14,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:17,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:36:18,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:36:18,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:36:18,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-09-28 19:36:20,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:36:23,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:23,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:36:23,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:36:23,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:27,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 19:36:30,138 INFO [train.py:1039] (0/4) Epoch 4, batch 2150, loss[loss=0.2729, simple_loss=0.3317, pruned_loss=0.107, over 24015.00 frames. ], tot_loss[loss=0.266, simple_loss=0.3194, pruned_loss=0.1063, over 4709675.95 frames. ], batch size: 80, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:36:30,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 19:36:30,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:31,184 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=15.0 2023-09-28 19:36:31,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:36:31,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:36:31,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:36:32,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=120573.33333333333, ans=0.125 2023-09-28 19:36:33,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:36:37,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:36:38,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:41,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:44,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:36:44,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:44,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:36:47,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 19:36:49,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:36:49,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:36:52,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:52,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 19:36:52,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=120640.0, ans=0.125 2023-09-28 19:36:57,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:00,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:37:00,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:02,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:37:02,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:37:02,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:37:04,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:37:04,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 19:37:06,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:37:08,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:08,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:09,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=120706.66666666667, ans=0.125 2023-09-28 19:37:10,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:37:12,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:37:14,141 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.15 vs. limit=10.0 2023-09-28 19:37:15,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 19:37:16,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:37:16,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:16,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 19:37:18,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:37:21,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:21,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:23,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:24,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:37:24,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:26,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:26,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 19:37:27,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. 
Duration: 0.91 2023-09-28 19:37:27,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:37:27,847 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 19:37:29,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:30,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:37:31,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 19:37:31,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:37:32,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 19:37:32,927 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 19:37:32,927 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 19:37:32,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 19:37:35,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:36,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:37:36,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:37:37,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:38,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:37:38,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:40,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:40,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=120840.0, ans=0.1 2023-09-28 19:37:45,483 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.345e+02 2.868e+02 3.477e+02 5.291e+02, threshold=5.737e+02, percent-clipped=0.0 2023-09-28 19:37:47,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:37:48,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 19:37:53,285 INFO [train.py:1039] (0/4) Epoch 4, batch 2200, loss[loss=0.2884, simple_loss=0.345, pruned_loss=0.1159, over 24052.00 frames. ], tot_loss[loss=0.2666, simple_loss=0.3201, pruned_loss=0.1066, over 4718410.79 frames. ], batch size: 80, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:37:53,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:37:58,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:58,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:37:59,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:01,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:38:04,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:38:04,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:38:04,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 19:38:11,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 19:38:13,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:38:15,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=120973.33333333333, ans=0.2 2023-09-28 19:38:20,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 19:38:23,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:38:25,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:25,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:38:28,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:38:30,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 19:38:30,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=121040.0, ans=0.125 2023-09-28 19:38:30,898 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.48 vs. limit=15.0 2023-09-28 19:38:31,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:38:34,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:34,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 19:38:38,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:38:39,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:43,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:38:44,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:48,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 19:38:48,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:49,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 19:38:51,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:51,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:38:51,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:53,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=121106.66666666667, ans=0.0 2023-09-28 19:38:55,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:55,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=121106.66666666667, ans=0.125 2023-09-28 19:38:56,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:56,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 19:38:56,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:58,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:38:58,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:38:59,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:39:01,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=121173.33333333333, ans=0.125 2023-09-28 19:39:02,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:39:04,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:07,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:39:07,563 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 19:39:10,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:39:11,983 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 19:39:12,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. 
Duration: 0.67775 2023-09-28 19:39:14,062 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 19:39:15,506 INFO [train.py:1039] (0/4) Epoch 4, batch 2250, loss[loss=0.2836, simple_loss=0.3433, pruned_loss=0.1119, over 24057.00 frames. ], tot_loss[loss=0.2674, simple_loss=0.3211, pruned_loss=0.1069, over 4716424.49 frames. ], batch size: 80, lr: 2.42e-02, grad_scale: 64.0 2023-09-28 19:39:15,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:17,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:39:19,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:19,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=121240.0, ans=0.04949747468305833 2023-09-28 19:39:19,771 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=12.0 2023-09-28 19:39:20,738 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 19:39:20,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:39:24,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:28,457 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.76 vs. 
limit=15.0 2023-09-28 19:39:28,465 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=12.0 2023-09-28 19:39:30,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:39:32,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:39:38,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:38,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:38,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:40,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 19:39:40,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:39:41,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:39:42,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 19:39:42,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:39:42,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:39:45,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:49,822 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.01 vs. limit=15.0 2023-09-28 19:39:50,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:52,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:39:52,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:39:54,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 19:39:54,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=121373.33333333333, ans=0.125 2023-09-28 19:39:57,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:58,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:40:02,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 19:40:05,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:05,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:40:08,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:40:10,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:40:13,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:40:16,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:40:21,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:40:21,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:40:21,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:40:27,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:40:31,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:40:31,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. 
Duration: 0.98 2023-09-28 19:40:31,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=121506.66666666667, ans=0.0 2023-09-28 19:40:31,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=121506.66666666667, ans=15.0 2023-09-28 19:40:32,080 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.293e+02 2.607e+02 2.902e+02 3.937e+02, threshold=5.215e+02, percent-clipped=0.0 2023-09-28 19:40:32,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:32,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:40:37,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 19:40:38,977 INFO [train.py:1039] (0/4) Epoch 4, batch 2300, loss[loss=0.2371, simple_loss=0.2935, pruned_loss=0.09039, over 19956.00 frames. ], tot_loss[loss=0.2694, simple_loss=0.3228, pruned_loss=0.1079, over 4711441.51 frames. ], batch size: 43, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:40:39,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:40:40,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:45,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:46,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 19:40:50,026 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 19:40:50,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=121573.33333333333, ans=0.1 2023-09-28 19:40:51,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:57,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:40:57,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:40:59,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:40:59,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:41:00,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 19:41:02,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:41:03,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:05,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:41:08,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 19:41:12,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:41:15,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:20,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:41:21,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:41:26,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:41:27,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:41:30,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:32,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:41:32,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:41:32,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 19:41:37,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:41:37,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 19:41:38,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=15.0 2023-09-28 19:41:39,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:39,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:41:40,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:42,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 19:41:42,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:41:42,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 19:41:42,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:41:42,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:44,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 19:41:49,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:41:54,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:41:58,189 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-09-28 19:41:59,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:59,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:41:59,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:42:00,451 INFO [train.py:1039] (0/4) Epoch 4, batch 2350, loss[loss=0.257, simple_loss=0.3146, pruned_loss=0.09965, over 24462.00 frames. ], tot_loss[loss=0.2695, simple_loss=0.323, pruned_loss=0.108, over 4714283.81 frames. ], batch size: 63, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:42:00,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:42:00,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:02,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:42:02,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 19:42:07,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=121906.66666666667, ans=0.2 2023-09-28 19:42:10,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 19:42:10,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 19:42:10,958 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=121906.66666666667, ans=0.125 2023-09-28 19:42:10,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=121906.66666666667, ans=0.125 2023-09-28 19:42:14,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=121906.66666666667, ans=0.0 2023-09-28 19:42:18,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 19:42:21,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:42:24,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:24,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:26,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-09-28 19:42:26,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=121973.33333333333, ans=0.1 2023-09-28 19:42:29,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:42:34,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 19:42:35,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:38,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:42:40,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:42,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:42:44,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 19:42:44,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:42:46,500 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=15.0 2023-09-28 19:42:47,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:42:47,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:42:47,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:42:52,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:42:54,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 19:42:56,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:58,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:58,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:42:59,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 19:43:01,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:43:03,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 19:43:04,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 19:43:06,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=122173.33333333333, ans=0.1 2023-09-28 19:43:07,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=122173.33333333333, ans=0.1 2023-09-28 19:43:09,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 19:43:10,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 19:43:12,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:43:12,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:43:12,532 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 19:43:12,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 19:43:16,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 19:43:18,092 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.367e+02 2.843e+02 3.356e+02 5.882e+02, threshold=5.686e+02, percent-clipped=1.0 2023-09-28 19:43:18,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 19:43:21,665 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:43:22,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:43:24,089 INFO [train.py:1039] (0/4) Epoch 4, batch 2400, loss[loss=0.2735, simple_loss=0.3322, pruned_loss=0.1074, over 23706.00 frames. ], tot_loss[loss=0.27, simple_loss=0.3231, pruned_loss=0.1084, over 4707514.02 frames. ], batch size: 85, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:43:28,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:43:29,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:43:31,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 19:43:31,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-09-28 19:43:31,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=122240.0, ans=0.125 2023-09-28 19:43:36,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=122240.0, ans=10.0 2023-09-28 19:43:36,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=122240.0, ans=0.2 2023-09-28 19:43:37,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=122240.0, ans=0.1 2023-09-28 19:43:39,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:43:39,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:43:40,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 19:43:40,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:43:42,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:42,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 19:43:49,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:52,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-09-28 19:43:57,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:44:02,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 19:44:04,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:07,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:12,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:12,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 19:44:12,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:44:18,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:22,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:44:24,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:25,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:44:25,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 19:44:27,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:44:27,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:27,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:29,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:44:30,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=122506.66666666667, ans=0.1 2023-09-28 19:44:32,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:44:34,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:44:34,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 19:44:36,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 19:44:40,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:40,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:40,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. 
Duration: 0.54 2023-09-28 19:44:41,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 19:44:41,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 19:44:41,824 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 19:44:43,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 19:44:44,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:45,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:45,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:46,684 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 19:44:46,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:47,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=122573.33333333333, ans=0.125 2023-09-28 19:44:48,027 INFO [train.py:1039] (0/4) Epoch 4, batch 2450, loss[loss=0.2705, simple_loss=0.2879, pruned_loss=0.1266, over 19426.00 frames. ], tot_loss[loss=0.2682, simple_loss=0.3215, pruned_loss=0.1075, over 4715387.19 frames. 
], batch size: 388, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:44:48,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:44:51,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:44:51,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:56,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:56,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:58,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 19:45:02,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:02,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:05,016 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=122640.0, ans=0.125 2023-09-28 19:45:06,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:45:08,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 19:45:08,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:45:08,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 19:45:11,466 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.93 vs. limit=12.0 2023-09-28 19:45:13,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:15,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:45:16,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:45:17,073 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=122640.0, ans=0.1 2023-09-28 19:45:20,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:45:20,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:23,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:23,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:45:24,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. 
Duration: 0.98 2023-09-28 19:45:26,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:45:34,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:36,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:36,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:36,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:45:38,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:39,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:45:40,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 19:45:45,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:45,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:45:48,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:45:48,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 19:45:48,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=122773.33333333333, ans=0.125 2023-09-28 19:45:54,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:45:54,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 19:45:56,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:45:56,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:56,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 19:45:57,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:45:58,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:45:59,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=122840.0, ans=0.125 2023-09-28 19:46:02,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 19:46:02,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=122840.0, ans=0.125 2023-09-28 19:46:03,794 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.418e+02 2.743e+02 3.147e+02 4.422e+02, threshold=5.485e+02, percent-clipped=0.0 2023-09-28 19:46:05,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:05,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:46:09,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 19:46:09,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:46:11,544 INFO [train.py:1039] (0/4) Epoch 4, batch 2500, loss[loss=0.2507, simple_loss=0.3065, pruned_loss=0.09742, over 24304.00 frames. ], tot_loss[loss=0.2667, simple_loss=0.3196, pruned_loss=0.1069, over 4716306.29 frames. ], batch size: 56, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:46:17,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=122906.66666666667, ans=0.0 2023-09-28 19:46:19,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:23,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=122906.66666666667, ans=0.0 2023-09-28 19:46:28,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 19:46:28,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:46:28,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=122973.33333333333, ans=0.0 2023-09-28 19:46:29,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:29,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 19:46:33,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.53 vs. limit=15.0 2023-09-28 19:46:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:46:37,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:46:38,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:46:38,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:46:40,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 19:46:40,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:46:42,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:44,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 19:46:44,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:44,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 19:46:44,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:46:50,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:52,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:56,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:46:56,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 19:46:56,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:46:59,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:04,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:47:07,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:12,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:15,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:47:15,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=123173.33333333333, ans=0.1 2023-09-28 19:47:17,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 19:47:17,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:19,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:47:20,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:47:20,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:47:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 19:47:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 19:47:22,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. 
Duration: 0.7 2023-09-28 19:47:25,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:29,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 19:47:29,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 19:47:30,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:30,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.49 vs. limit=22.5 2023-09-28 19:47:31,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 19:47:32,055 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.68 vs. limit=10.0 2023-09-28 19:47:34,347 INFO [train.py:1039] (0/4) Epoch 4, batch 2550, loss[loss=0.2503, simple_loss=0.3139, pruned_loss=0.09332, over 24296.00 frames. ], tot_loss[loss=0.2664, simple_loss=0.3196, pruned_loss=0.1066, over 4721912.71 frames. ], batch size: 61, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:47:36,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 19:47:37,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:39,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 19:47:40,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:47:41,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=123240.0, ans=0.125 2023-09-28 19:47:42,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:44,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 19:47:45,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:47:48,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 19:47:50,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:47:50,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=123306.66666666667, ans=0.2 2023-09-28 19:47:50,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=123306.66666666667, ans=0.125 2023-09-28 19:47:54,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:55,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=123306.66666666667, ans=0.1 2023-09-28 19:47:56,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:47:56,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 19:47:56,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:47:58,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:47:58,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:58,637 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=123306.66666666667, ans=0.1 2023-09-28 19:48:02,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:48:02,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 19:48:02,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:48:02,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:02,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 19:48:06,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=123373.33333333333, ans=0.125 2023-09-28 19:48:13,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 19:48:13,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=123373.33333333333, ans=0.125 2023-09-28 19:48:17,285 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.86 vs. limit=15.0 2023-09-28 19:48:18,699 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=12.0 2023-09-28 19:48:19,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:19,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:48:21,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:48:27,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:30,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:48:30,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:48:31,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 19:48:31,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:48:31,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=123440.0, ans=0.125 2023-09-28 19:48:32,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:48:35,788 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.37 vs. limit=15.0 2023-09-28 19:48:36,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:38,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:44,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:48:44,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 19:48:44,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:48:44,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:46,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 19:48:47,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:48:47,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:49,403 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.344e+02 2.648e+02 3.019e+02 5.195e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 19:48:54,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:48:55,603 INFO [train.py:1039] (0/4) Epoch 4, batch 2600, loss[loss=0.2861, simple_loss=0.3337, pruned_loss=0.1193, over 23680.00 frames. ], tot_loss[loss=0.2668, simple_loss=0.3199, pruned_loss=0.1069, over 4718580.17 frames. ], batch size: 232, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:48:55,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:58,998 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 19:49:02,116 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 19:49:02,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:49:04,144 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 19:49:04,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 19:49:04,296 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. 
Duration: 0.768 2023-09-28 19:49:07,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:49:07,450 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 19:49:09,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 19:49:11,072 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 19:49:12,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:49:14,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 19:49:16,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 19:49:16,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=123640.0, ans=0.2 2023-09-28 19:49:16,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=123640.0, ans=0.125 2023-09-28 19:49:17,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:49:18,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 19:49:19,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=123640.0, ans=0.125 2023-09-28 19:49:20,942 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-09-28 19:49:20,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 19:49:27,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:27,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:27,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:27,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 19:49:28,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:49:37,228 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 19:49:41,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:41,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:42,573 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.29 vs. limit=22.5 2023-09-28 19:49:43,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 19:49:44,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 19:49:44,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:44,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 19:49:50,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:49:50,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:49:53,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 19:49:56,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:50:01,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:50:01,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:50:01,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-09-28 19:50:03,088 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:50:04,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:50:05,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:07,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:14,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 19:50:16,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:18,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:50:19,990 INFO [train.py:1039] (0/4) Epoch 4, batch 2650, loss[loss=0.23, simple_loss=0.29, pruned_loss=0.08495, over 24379.00 frames. ], tot_loss[loss=0.2675, simple_loss=0.3208, pruned_loss=0.1071, over 4725594.85 frames. ], batch size: 56, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:50:21,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 19:50:21,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:21,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 19:50:24,068 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 19:50:24,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:50:26,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:29,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:50:30,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:33,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:50:33,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 19:50:33,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:50:33,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:50:37,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 19:50:39,496 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 19:50:43,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:50:48,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 19:50:48,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:50:48,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 19:50:53,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:50:53,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:50:57,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=124040.0, ans=0.04949747468305833 2023-09-28 19:51:01,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 19:51:01,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 19:51:03,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:07,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-09-28 19:51:07,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:09,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:09,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:11,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:11,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:14,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:15,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:16,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=124106.66666666667, ans=0.125 2023-09-28 19:51:17,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:51:17,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:51:19,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 19:51:20,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:21,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:51:23,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:24,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:24,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:51:27,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:27,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=124173.33333333333, ans=0.95 2023-09-28 19:51:29,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:51:29,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:31,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 19:51:35,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:51:35,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:36,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=124173.33333333333, ans=0.2 2023-09-28 19:51:37,253 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.373e+02 3.005e+02 3.591e+02 5.745e+02, threshold=6.010e+02, percent-clipped=4.0 2023-09-28 19:51:39,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:40,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:42,340 INFO [train.py:1039] (0/4) Epoch 4, batch 2700, loss[loss=0.2504, simple_loss=0.3211, pruned_loss=0.08988, over 24620.00 frames. ], tot_loss[loss=0.2676, simple_loss=0.3213, pruned_loss=0.1069, over 4729688.46 frames. ], batch size: 68, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:51:42,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:42,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:44,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:51:44,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 19:51:48,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 19:51:50,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 19:51:51,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:51,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:53,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:55,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:51:55,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:55,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:51:57,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:51:57,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 19:51:57,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:51:58,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:58,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 19:52:00,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:05,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:52:05,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 19:52:07,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:52:12,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:52:12,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:52:12,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=124306.66666666667, ans=0.07 2023-09-28 19:52:18,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:52:18,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:52:18,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:52:18,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:52:21,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:52:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:26,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:52:26,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:52:31,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:31,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:52:40,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:52:40,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:52:45,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:52:45,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:52:48,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:48,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:52:50,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:52,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:55,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:55,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:52:58,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:53:00,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:00,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:02,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 19:53:03,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:05,057 INFO [train.py:1039] (0/4) Epoch 4, batch 2750, loss[loss=0.26, simple_loss=0.3229, pruned_loss=0.09856, over 24449.00 frames. ], tot_loss[loss=0.2675, simple_loss=0.3214, pruned_loss=0.1069, over 4733779.54 frames. ], batch size: 63, lr: 2.39e-02, grad_scale: 16.0 2023-09-28 19:53:06,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 19:53:06,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 19:53:08,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 19:53:08,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:10,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:10,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:15,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:15,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:53:15,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:18,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:20,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:53:20,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:53:20,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 19:53:20,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 19:53:20,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:53:20,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:20,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=124640.0, ans=0.125 2023-09-28 19:53:20,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=124640.0, ans=0.0 2023-09-28 19:53:27,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 19:53:30,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:53:30,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:30,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:53:32,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:53:32,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:33,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 19:53:35,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:35,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:38,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:53:38,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:53:39,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:53:40,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:43,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:53:50,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:52,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:53:52,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:53:52,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=124706.66666666667, ans=6.0 2023-09-28 19:53:53,073 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.91 vs. limit=15.0 2023-09-28 19:53:56,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:56,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:53:58,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:54:04,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:54:04,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:54:04,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 19:54:07,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=124773.33333333333, ans=0.125 2023-09-28 19:54:10,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:11,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. 
Duration: 0.67 2023-09-28 19:54:17,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:54:19,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:54:19,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 19:54:20,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:54:22,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:54:24,078 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.875e+02 2.756e+02 3.214e+02 3.920e+02 6.552e+02, threshold=6.428e+02, percent-clipped=3.0 2023-09-28 19:54:24,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 19:54:25,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:54:28,567 INFO [train.py:1039] (0/4) Epoch 4, batch 2800, loss[loss=0.2874, simple_loss=0.3356, pruned_loss=0.1196, over 23374.00 frames. ], tot_loss[loss=0.2663, simple_loss=0.3206, pruned_loss=0.106, over 4741484.89 frames. ], batch size: 106, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:54:28,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 19:54:30,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:54:30,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:54:30,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 19:54:30,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:31,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:33,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:33,515 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 19:54:33,516 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 19:54:38,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:40,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:54:41,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:54:43,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:54:47,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. 
Duration: 0.79 2023-09-28 19:54:48,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 19:54:50,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 19:54:50,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:50,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:54:50,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:54:55,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:54:55,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:55,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:54:57,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:01,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=125040.0, ans=0.1 2023-09-28 19:55:07,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:55:08,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:55:11,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:13,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:55:13,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:19,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:19,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 19:55:21,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:21,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:23,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:55:26,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:26,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:32,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:36,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 19:55:36,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:36,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:55:36,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:55:37,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:55:37,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:55:37,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 19:55:39,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:40,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:40,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:42,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 19:55:42,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:42,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 19:55:42,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=125173.33333333333, ans=0.1 2023-09-28 19:55:43,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:55:44,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 19:55:52,388 INFO [train.py:1039] (0/4) Epoch 4, batch 2850, loss[loss=0.2509, simple_loss=0.3031, pruned_loss=0.09933, over 23909.00 frames. ], tot_loss[loss=0.2653, simple_loss=0.3188, pruned_loss=0.1059, over 4721516.80 frames. ], batch size: 195, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:55:52,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:52,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:55:53,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.57 vs. limit=15.0 2023-09-28 19:55:54,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:55:57,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:00,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:00,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 19:56:00,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:56:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:04,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:56:05,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:56:07,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 19:56:15,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 19:56:15,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:17,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 19:56:18,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:21,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 19:56:21,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 19:56:22,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 19:56:31,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=125373.33333333333, ans=0.0 2023-09-28 19:56:35,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:36,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=125373.33333333333, ans=0.0 2023-09-28 19:56:37,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:37,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:37,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:56:37,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:56:38,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:56:40,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:56:40,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 19:56:43,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:56:43,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 19:56:45,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:45,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:48,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:48,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:50,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:53,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:55,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:56:55,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:56,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:58,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:57:03,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:57:05,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. 
Duration: 0.83 2023-09-28 19:57:05,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 19:57:06,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:57:08,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:08,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 19:57:09,325 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.413e+02 2.730e+02 3.344e+02 4.987e+02, threshold=5.460e+02, percent-clipped=0.0 2023-09-28 19:57:09,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:57:10,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:10,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:10,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:57:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 19:57:12,299 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 19:57:12,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 19:57:13,737 INFO [train.py:1039] (0/4) Epoch 4, batch 2900, loss[loss=0.2804, simple_loss=0.3304, pruned_loss=0.1152, over 23219.00 frames. ], tot_loss[loss=0.2646, simple_loss=0.3183, pruned_loss=0.1055, over 4709368.47 frames. ], batch size: 105, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:57:13,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:15,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=125573.33333333333, ans=0.0 2023-09-28 19:57:18,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:18,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:20,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:57:20,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 19:57:25,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:25,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 19:57:26,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 19:57:30,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 19:57:30,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:57:30,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:32,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:57:36,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:37,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:40,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:57:40,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 19:57:42,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:57:43,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:45,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 19:57:47,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 19:57:50,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:57:50,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 19:57:50,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:57:53,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:57:53,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:56,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:56,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:01,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:58:03,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:07,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 19:58:07,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 19:58:07,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:58:09,972 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.29 vs. 
limit=15.0 2023-09-28 19:58:10,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:58:15,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 19:58:15,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:58:19,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:27,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:58:27,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:58:29,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 19:58:31,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=125840.0, ans=0.125 2023-09-28 19:58:31,880 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-09-28 19:58:32,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:34,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 19:58:34,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 19:58:34,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:58:35,887 INFO [train.py:1039] (0/4) Epoch 4, batch 2950, loss[loss=0.2415, simple_loss=0.3007, pruned_loss=0.09118, over 24306.00 frames. ], tot_loss[loss=0.2652, simple_loss=0.3193, pruned_loss=0.1056, over 4716449.30 frames. ], batch size: 56, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:58:41,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:43,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 19:58:43,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:43,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:46,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:58:48,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:58:48,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 19:58:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 19:58:50,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-28 19:58:50,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:57,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:58:58,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=125973.33333333333, ans=0.0 2023-09-28 19:58:58,974 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.20 vs. limit=22.5 2023-09-28 19:58:59,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:58:59,690 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=125973.33333333333, ans=0.125 2023-09-28 19:59:01,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:01,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:04,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:04,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:59:05,584 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.79 vs. 
limit=15.0 2023-09-28 19:59:06,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:59:12,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 19:59:16,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 19:59:16,033 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 19:59:16,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:59:18,228 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 19:59:20,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 19:59:21,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:21,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:59:21,104 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. 
Duration: 0.896 2023-09-28 19:59:21,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:59:22,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 19:59:23,487 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=24.04 vs. limit=22.5 2023-09-28 19:59:24,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:25,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:59:27,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:28,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:59:28,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:29,387 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.79 vs. limit=22.5 2023-09-28 19:59:30,301 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 19:59:30,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=126106.66666666667, ans=0.2 2023-09-28 19:59:31,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 19:59:31,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 19:59:37,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:38,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:59:40,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 19:59:40,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:59:41,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 19:59:45,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:47,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:47,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:59:50,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:50,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:59:52,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 19:59:53,510 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.372e+02 2.758e+02 3.353e+02 4.666e+02, threshold=5.516e+02, percent-clipped=0.0 2023-09-28 19:59:53,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:53,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:59:53,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:59:53,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:55,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:59:56,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:56,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 19:59:58,287 INFO [train.py:1039] (0/4) Epoch 4, batch 3000, loss[loss=0.2675, simple_loss=0.3345, pruned_loss=0.1002, over 24333.00 frames. ], tot_loss[loss=0.2673, simple_loss=0.3209, pruned_loss=0.1068, over 4707570.93 frames. 
], batch size: 74, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:59:58,288 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 20:00:07,895 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.2.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([5.0022, 3.8925, 4.7574, 4.2369], device='cuda:0') 2023-09-28 20:00:13,217 INFO [train.py:1071] (0/4) Epoch 4, validation: loss=0.3352, simple_loss=0.3262, pruned_loss=0.1721, over 1125622.00 frames. 2023-09-28 20:00:13,218 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 20:00:13,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:00:15,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:00:16,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:00:19,650 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 20:00:19,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 20:00:23,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:00:23,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:00:25,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 20:00:25,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:00:31,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:00:40,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:00:42,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=126306.66666666667, ans=0.125 2023-09-28 20:00:46,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=126373.33333333333, ans=0.0 2023-09-28 20:00:47,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 20:00:49,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:00:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:00:52,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:52,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:00:56,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:00:56,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 20:00:59,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-09-28 20:01:01,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:01:01,977 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.76 vs. limit=10.0 2023-09-28 20:01:02,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:01:05,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:01:05,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:05,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:05,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:01:09,715 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.84 vs. limit=15.0 2023-09-28 20:01:10,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=126440.0, ans=0.2 2023-09-28 20:01:11,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:01:11,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:01:11,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:01:13,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:15,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 20:01:16,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:01:16,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:16,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:01:17,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=126506.66666666667, ans=0.125 2023-09-28 20:01:20,458 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=126506.66666666667, ans=0.2 2023-09-28 20:01:22,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:22,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:23,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:01:23,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. 
Duration: 0.58 2023-09-28 20:01:25,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:01:25,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 20:01:26,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:01:28,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 20:01:28,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=126506.66666666667, ans=0.125 2023-09-28 20:01:32,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:01:32,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:01:32,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 20:01:34,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 20:01:34,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:01:35,613 INFO [train.py:1039] (0/4) Epoch 4, batch 3050, loss[loss=0.2392, simple_loss=0.2969, pruned_loss=0.09074, over 24372.00 frames. ], tot_loss[loss=0.2669, simple_loss=0.3209, pruned_loss=0.1064, over 4719341.88 frames. 
], batch size: 56, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 20:01:35,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:01:37,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:37,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:01:37,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:37,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:01:38,062 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=12.0 2023-09-28 20:01:40,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 20:01:43,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:01:44,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:01:44,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:01:48,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:01:51,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 20:01:53,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=126640.0, ans=0.2 2023-09-28 20:01:56,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 20:01:56,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 20:01:56,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:00,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:02:05,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:05,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:07,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:10,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:10,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:02:10,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:02:12,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:12,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:12,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:15,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:19,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:19,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 20:02:19,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:19,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:02:22,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:02:24,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:02:24,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:02:25,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 20:02:32,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:32,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:38,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=126773.33333333333, ans=0.125 2023-09-28 20:02:41,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:42,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:02:42,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:42,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:43,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:02:43,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:44,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 20:02:45,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 20:02:45,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:47,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 20:02:48,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:52,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:53,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.370e+02 2.708e+02 3.419e+02 5.330e+02, threshold=5.417e+02, percent-clipped=0.0 2023-09-28 20:02:53,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:02:56,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:02:58,143 INFO [train.py:1039] (0/4) Epoch 4, batch 3100, loss[loss=0.2592, simple_loss=0.3167, pruned_loss=0.1008, over 23324.00 frames. ], tot_loss[loss=0.267, simple_loss=0.3204, pruned_loss=0.1069, over 4716125.35 frames. ], batch size: 93, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:02:59,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 20:03:01,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 20:03:01,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. 
Duration: 0.83 2023-09-28 20:03:04,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:03:07,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:03:07,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:09,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:03:14,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:20,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 20:03:22,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=126973.33333333333, ans=0.125 2023-09-28 20:03:28,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:03:28,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:29,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:03:29,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:03:31,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:03:32,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:03:33,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 20:03:33,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:03:33,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=127040.0, ans=0.2 2023-09-28 20:03:34,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:36,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 20:03:36,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:03:36,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=127040.0, ans=0.125 2023-09-28 20:03:41,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:03:41,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 20:03:43,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-09-28 20:03:45,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:45,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:48,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:03:48,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:48,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:03:52,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:03:52,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:03:52,492 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=127106.66666666667, ans=0.125 2023-09-28 20:03:52,888 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-09-28 20:03:55,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:03:55,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:03:55,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:55,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:03:59,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:03:59,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=127106.66666666667, ans=0.125 2023-09-28 20:04:01,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 20:04:02,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:04:02,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 20:04:04,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:04,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:04,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 20:04:17,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 20:04:21,100 INFO [train.py:1039] (0/4) Epoch 4, batch 3150, loss[loss=0.2541, simple_loss=0.307, pruned_loss=0.1006, over 23602.00 frames. 
], tot_loss[loss=0.2655, simple_loss=0.3194, pruned_loss=0.1058, over 4721441.76 frames. ], batch size: 149, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:04:21,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:21,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:22,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:04:22,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:04:24,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 20:04:24,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:24,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:04:26,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 20:04:26,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=127240.0, ans=0.1 2023-09-28 20:04:29,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:04:30,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=127240.0, ans=0.125 2023-09-28 20:04:32,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 20:04:35,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 20:04:35,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:04:37,554 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 20:04:37,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:04:39,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 20:04:39,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 20:04:39,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 20:04:39,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:41,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:04:43,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:04:44,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 20:04:46,449 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=127306.66666666667, ans=0.0 2023-09-28 20:04:47,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:47,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:48,187 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.00 vs. limit=22.5 2023-09-28 20:04:48,488 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.54 vs. limit=22.5 2023-09-28 20:04:48,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:50,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:04:52,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 20:04:54,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:04:56,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 20:04:58,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:58,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 20:05:01,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 20:05:02,311 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.75 vs. limit=22.5 2023-09-28 20:05:03,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:05:03,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:05:03,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:05:04,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:04,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:05:06,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:05:06,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:05:07,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-09-28 20:05:07,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:05:07,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:09,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:05:09,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:05:11,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 20:05:11,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:12,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 20:05:12,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:12,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 20:05:14,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 20:05:17,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:05:17,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:05:19,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 20:05:21,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:05:21,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:25,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:05:25,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=127440.0, ans=0.025 2023-09-28 20:05:26,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:26,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:05:33,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:05:34,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:38,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-09-28 20:05:38,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=127506.66666666667, ans=0.0 2023-09-28 20:05:39,693 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.347e+02 2.789e+02 3.421e+02 6.245e+02, threshold=5.579e+02, percent-clipped=5.0 2023-09-28 20:05:43,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:05:43,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 20:05:44,421 INFO [train.py:1039] (0/4) Epoch 4, batch 3200, loss[loss=0.254, simple_loss=0.2997, pruned_loss=0.1041, over 23618.00 frames. ], tot_loss[loss=0.264, simple_loss=0.3177, pruned_loss=0.1051, over 4728253.93 frames. ], batch size: 149, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:05:48,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:49,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:05:49,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 20:05:52,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:57,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:06:02,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:06:12,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:06:12,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=127640.0, ans=0.1 2023-09-28 20:06:23,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 20:06:23,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:06:27,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 20:06:29,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:06:29,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=127706.66666666667, ans=0.2 2023-09-28 20:06:32,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:06:32,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:06:32,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=127773.33333333333, ans=0.125 2023-09-28 20:06:35,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:06:38,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-09-28 20:06:39,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=127773.33333333333, ans=0.125 2023-09-28 20:06:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:06:43,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 20:06:45,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 20:06:48,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:06:54,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:54,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:06:54,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:55,547 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 20:06:55,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:06:59,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:06:59,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. 
Duration: 0.64 2023-09-28 20:07:01,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 20:07:01,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 20:07:02,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 20:07:02,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=127840.0, ans=0.0 2023-09-28 20:07:05,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:07:07,644 INFO [train.py:1039] (0/4) Epoch 4, batch 3250, loss[loss=0.2684, simple_loss=0.3194, pruned_loss=0.1087, over 23851.00 frames. ], tot_loss[loss=0.2634, simple_loss=0.318, pruned_loss=0.1044, over 4731075.13 frames. ], batch size: 179, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:07:09,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:07:09,345 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 20:07:09,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:09,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:12,409 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 20:07:16,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-28 20:07:17,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:29,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:07:29,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 20:07:30,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:30,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:07:32,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:32,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:32,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:07:32,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=127973.33333333333, ans=0.1 2023-09-28 20:07:34,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:34,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:07:34,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 20:07:36,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:07:37,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:39,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:41,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:41,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:43,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:44,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:44,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:07:48,941 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=6.76 vs. limit=15.0 2023-09-28 20:07:51,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. 
Duration: 0.56 2023-09-28 20:07:51,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:52,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:07:52,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:55,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:07:59,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:08:06,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:06,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:06,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 20:08:06,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:08:07,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:08:07,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:11,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. 
Duration: 0.9 2023-09-28 20:08:11,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 20:08:11,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:08:13,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:13,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:13,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=128173.33333333333, ans=0.125 2023-09-28 20:08:14,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:08:16,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:19,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:19,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:21,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 20:08:21,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:23,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 20:08:23,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 20:08:25,972 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.244e+02 2.605e+02 3.006e+02 4.571e+02, threshold=5.210e+02, percent-clipped=0.0 2023-09-28 20:08:26,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:26,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 20:08:27,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 20:08:29,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 20:08:29,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:30,579 INFO [train.py:1039] (0/4) Epoch 4, batch 3300, loss[loss=0.2658, simple_loss=0.3294, pruned_loss=0.1011, over 24487.00 frames. ], tot_loss[loss=0.2643, simple_loss=0.3186, pruned_loss=0.105, over 4726367.10 frames. ], batch size: 69, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:08:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:35,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:08:37,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:08:39,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:08:39,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:08:42,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:43,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:49,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 20:08:49,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:08:49,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:52,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:52,148 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 20:08:52,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=128306.66666666667, ans=0.1 2023-09-28 20:08:55,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:08:55,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 20:08:57,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:08:57,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:08:57,277 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 20:09:00,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:00,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:09:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:02,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 20:09:04,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:09:04,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:06,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:09:08,689 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. 
Duration: 0.768 2023-09-28 20:09:08,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=128373.33333333333, ans=0.1 2023-09-28 20:09:10,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 20:09:11,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:09:15,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 20:09:18,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:19,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:09:21,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:09:24,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:25,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:25,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:25,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:09:28,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 20:09:28,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:28,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=128440.0, ans=0.125 2023-09-28 20:09:30,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:09:31,867 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 20:09:34,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 20:09:37,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:09:39,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:09:39,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:40,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:40,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:09:43,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:09:43,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:09:45,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:46,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:09:50,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 20:09:50,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:09:50,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:51,894 INFO [train.py:1039] (0/4) Epoch 4, batch 3350, loss[loss=0.2568, simple_loss=0.329, pruned_loss=0.09232, over 24406.00 frames. ], tot_loss[loss=0.2639, simple_loss=0.319, pruned_loss=0.1044, over 4732040.80 frames. ], batch size: 69, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:09:53,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:09:53,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:55,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:56,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:09:56,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:00,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:10:00,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:04,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:10:06,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:09,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:10:09,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:10,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:10:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 20:10:12,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=128640.0, ans=0.125 2023-09-28 20:10:13,813 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 20:10:13,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:10:15,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 20:10:15,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 20:10:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:10:18,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:10:19,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:20,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 20:10:20,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=128640.0, ans=0.2 2023-09-28 20:10:21,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:21,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:10:23,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:26,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:26,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:10:28,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:10:32,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:33,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:33,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:37,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:10:37,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:37,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=128706.66666666667, ans=0.0 2023-09-28 20:10:39,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:39,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:43,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:45,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 20:10:45,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-28 20:10:45,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 20:10:46,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:10:46,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 20:10:48,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:49,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:57,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:57,738 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=128840.0, ans=0.0 2023-09-28 20:10:58,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 20:10:58,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:01,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:11:03,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:11:08,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:11:09,997 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.829e+02 2.437e+02 2.848e+02 3.538e+02 5.302e+02, threshold=5.697e+02, percent-clipped=3.0 2023-09-28 20:11:11,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 20:11:11,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:11:11,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:11:14,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:14,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 20:11:14,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:11:14,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 20:11:16,246 INFO [train.py:1039] (0/4) Epoch 4, batch 3400, loss[loss=0.2565, simple_loss=0.3259, pruned_loss=0.0935, over 24438.00 frames. ], tot_loss[loss=0.2653, simple_loss=0.3204, pruned_loss=0.1052, over 4732747.87 frames. ], batch size: 69, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:11:16,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:16,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:11:18,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:11:18,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:11:18,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 20:11:24,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 20:11:24,286 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 20:11:24,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:11:29,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=128906.66666666667, ans=0.0 2023-09-28 20:11:30,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:30,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:31,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:33,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:11:37,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:11:40,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 20:11:45,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:11:47,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:47,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:48,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:11:56,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:11:57,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=129040.0, ans=0.0 2023-09-28 20:12:01,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 20:12:02,114 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.66 vs. limit=15.0 2023-09-28 20:12:06,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-09-28 20:12:07,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:08,139 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=129106.66666666667, ans=0.1 2023-09-28 20:12:09,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:12:11,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:12:11,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:12:13,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:12:16,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:12:16,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:12:22,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.22 vs. limit=6.0 2023-09-28 20:12:23,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:25,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. 
Duration: 0.95 2023-09-28 20:12:34,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:12:37,268 INFO [train.py:1039] (0/4) Epoch 4, batch 3450, loss[loss=0.2539, simple_loss=0.3025, pruned_loss=0.1027, over 23710.00 frames. ], tot_loss[loss=0.2653, simple_loss=0.3199, pruned_loss=0.1054, over 4737910.90 frames. ], batch size: 149, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:12:39,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 20:12:42,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 20:12:42,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:44,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:12:44,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 20:12:45,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:51,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:12:55,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:12:57,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:12:59,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:12:59,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:01,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:02,233 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.40 vs. limit=6.0 2023-09-28 20:13:08,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 20:13:12,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 20:13:14,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:13:14,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:13:14,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=129373.33333333333, ans=0.04949747468305833 2023-09-28 20:13:15,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:16,682 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.32 vs. 
limit=22.5 2023-09-28 20:13:22,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 20:13:22,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:13:25,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:27,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:13:27,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=129440.0, ans=0.125 2023-09-28 20:13:30,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:13:30,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:13:32,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 20:13:32,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:13:33,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:37,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:13:40,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-09-28 20:13:43,265 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.89 vs. limit=15.0 2023-09-28 20:13:43,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:13:48,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:13:50,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:52,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:13:53,216 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.76 vs. limit=15.0 2023-09-28 20:13:55,854 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.310e+02 2.657e+02 3.151e+02 5.022e+02, threshold=5.313e+02, percent-clipped=0.0 2023-09-28 20:13:57,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:57,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:57,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:13:59,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:14:01,131 INFO [train.py:1039] (0/4) Epoch 4, batch 3500, loss[loss=0.2765, simple_loss=0.3236, pruned_loss=0.1147, over 23279.00 frames. ], tot_loss[loss=0.2645, simple_loss=0.3186, pruned_loss=0.1052, over 4731078.82 frames. ], batch size: 93, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:14:04,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:05,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:14:08,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 20:14:08,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=129573.33333333333, ans=0.125 2023-09-28 20:14:09,052 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.57 vs. limit=10.0 2023-09-28 20:14:11,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:14:12,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:14:14,447 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=129573.33333333333, ans=0.1 2023-09-28 20:14:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:15,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-09-28 20:14:22,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:14:23,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:14:23,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:14:23,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:14:25,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:25,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:25,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 20:14:29,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:31,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:14:31,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=129640.0, ans=0.125 2023-09-28 20:14:32,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:14:32,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=129706.66666666667, ans=0.0 2023-09-28 20:14:37,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:37,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 20:14:37,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:40,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:43,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:14:43,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:45,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:14:45,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:48,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 20:14:48,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 20:14:49,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-09-28 20:14:51,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:51,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:52,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:14:56,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:14:57,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:15:04,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:06,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 20:15:06,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 20:15:06,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:07,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:09,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 20:15:11,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:14,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 20:15:14,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:15,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:15:16,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=129840.0, ans=0.0 2023-09-28 20:15:18,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 20:15:19,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 20:15:21,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:22,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:22,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:22,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:24,306 INFO [train.py:1039] (0/4) Epoch 4, batch 3550, loss[loss=0.2878, simple_loss=0.3308, pruned_loss=0.1224, over 23344.00 frames. 
], tot_loss[loss=0.2632, simple_loss=0.3174, pruned_loss=0.1045, over 4732385.50 frames. ], batch size: 119, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:15:27,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:15:27,814 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=129906.66666666667, ans=0.125 2023-09-28 20:15:34,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:36,832 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.17 vs. limit=22.5 2023-09-28 20:15:37,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 20:15:38,848 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.65 vs. limit=15.0 2023-09-28 20:15:41,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:43,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:15:46,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:46,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:15:46,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:15:47,074 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=129973.33333333333, ans=0.0 2023-09-28 20:15:51,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:51,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:15:52,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:52,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:15:53,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:15:59,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:15:59,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:16:01,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:02,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 20:16:02,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 20:16:02,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:04,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:05,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:16:09,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:11,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:16:13,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:13,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=130106.66666666667, ans=0.2 2023-09-28 20:16:15,458 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.47 vs. limit=22.5 2023-09-28 20:16:16,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 20:16:18,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:16:18,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. 
Duration: 0.86 2023-09-28 20:16:19,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:21,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:16:21,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:16:24,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 20:16:24,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:31,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:32,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 20:16:32,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:36,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:39,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-09-28 20:16:39,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=130173.33333333333, ans=0.125 2023-09-28 20:16:42,190 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.286e+02 2.757e+02 3.216e+02 5.394e+02, threshold=5.514e+02, percent-clipped=1.0 2023-09-28 20:16:44,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 20:16:45,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:16:46,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:16:46,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:48,263 INFO [train.py:1039] (0/4) Epoch 4, batch 3600, loss[loss=0.2401, simple_loss=0.3086, pruned_loss=0.08581, over 24509.00 frames. ], tot_loss[loss=0.2634, simple_loss=0.3174, pruned_loss=0.1047, over 4718771.09 frames. ], batch size: 66, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:16:48,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:48,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=130240.0, ans=0.125 2023-09-28 20:16:50,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:16:54,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:16:56,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:57,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:17:00,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:17:01,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:01,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 20:17:04,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:17:05,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=130306.66666666667, ans=0.5 2023-09-28 20:17:06,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:09,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=130306.66666666667, ans=0.125 2023-09-28 20:17:10,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:13,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 20:17:15,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:17:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:17:15,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 20:17:17,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:19,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=130373.33333333333, ans=0.2 2023-09-28 20:17:20,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:20,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:17:22,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:24,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:26,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:17:28,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-09-28 20:17:28,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=130373.33333333333, ans=0.125 2023-09-28 20:17:35,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:17:35,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:17:36,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 20:17:41,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:17:45,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:47,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:54,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:17:54,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:17:54,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 20:17:57,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-09-28 20:17:59,050 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=130506.66666666667, ans=0.0 2023-09-28 20:18:00,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 20:18:02,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:18:02,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:18:02,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=130506.66666666667, ans=0.0 2023-09-28 20:18:03,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 20:18:05,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:05,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:18:05,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:07,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 20:18:07,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. 
Duration: 0.45 2023-09-28 20:18:10,453 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.90 vs. limit=10.0 2023-09-28 20:18:10,877 INFO [train.py:1039] (0/4) Epoch 4, batch 3650, loss[loss=0.2932, simple_loss=0.3384, pruned_loss=0.124, over 23660.00 frames. ], tot_loss[loss=0.2636, simple_loss=0.3179, pruned_loss=0.1047, over 4716499.93 frames. ], batch size: 85, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:18:11,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:18:11,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 20:18:17,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 20:18:18,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=7.84 vs. limit=15.0 2023-09-28 20:18:18,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:18:19,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=130573.33333333333, ans=0.125 2023-09-28 20:18:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 20:18:24,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 20:18:29,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:18:29,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:18:29,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:18:32,516 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:18:35,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:18:35,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:37,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 20:18:37,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:18:39,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:39,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 20:18:39,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:18:41,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:18:41,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:18:43,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:18:46,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 20:18:47,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 20:18:48,137 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=130706.66666666667, ans=0.0 2023-09-28 20:18:49,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:18:52,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 20:18:53,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:18:53,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:18:58,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:19:00,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:00,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:19:02,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 20:19:03,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:19:04,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:19:06,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:09,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:09,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:19:11,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:19:13,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:13,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:21,082 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 20:19:22,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:24,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:25,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. 
Duration: 0.62225 2023-09-28 20:19:25,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:27,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:19:28,543 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.316e+02 2.706e+02 3.127e+02 4.745e+02, threshold=5.412e+02, percent-clipped=0.0 2023-09-28 20:19:28,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:30,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 20:19:30,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:33,345 INFO [train.py:1039] (0/4) Epoch 4, batch 3700, loss[loss=0.2736, simple_loss=0.3387, pruned_loss=0.1042, over 24317.00 frames. ], tot_loss[loss=0.2657, simple_loss=0.3197, pruned_loss=0.1059, over 4720458.91 frames. ], batch size: 74, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:19:34,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:19:37,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:39,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 20:19:41,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=130906.66666666667, ans=10.0 2023-09-28 20:19:43,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:43,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 20:19:43,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:43,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:19:44,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:19:47,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:19:48,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=130973.33333333333, ans=0.125 2023-09-28 20:19:51,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:51,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:53,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 20:19:53,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:53,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:19:55,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:56,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 20:20:04,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=131040.0, ans=0.0 2023-09-28 20:20:05,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:20:06,290 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.60 vs. limit=15.0 2023-09-28 20:20:07,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:20:07,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:20:07,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 20:20:08,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:11,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:20:13,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 20:20:13,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=131040.0, ans=0.2 2023-09-28 20:20:15,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:16,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:20:19,406 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.96 vs. limit=6.0 2023-09-28 20:20:20,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:20,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:20:23,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:20:25,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=131106.66666666666, ans=0.125 2023-09-28 20:20:26,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:26,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. 
Duration: 0.81 2023-09-28 20:20:28,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:20:28,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 20:20:32,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:20:32,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:20:34,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=131106.66666666666, ans=0.1 2023-09-28 20:20:37,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:37,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 20:20:39,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:20:39,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:20:40,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:40,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:44,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 20:20:45,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 20:20:45,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 20:20:47,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:20:47,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:20:48,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:20:48,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=131173.33333333334, ans=0.0 2023-09-28 20:20:50,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:20:53,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:55,679 INFO [train.py:1039] (0/4) Epoch 4, batch 3750, loss[loss=0.2814, simple_loss=0.3305, pruned_loss=0.1162, over 23774.00 frames. ], tot_loss[loss=0.2668, simple_loss=0.3208, pruned_loss=0.1064, over 4719931.01 frames. ], batch size: 149, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:20:55,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:20:57,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:20:59,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 20:21:00,696 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-09-28 20:21:01,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:21:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:21:04,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 20:21:06,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:21:07,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:12,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:17,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:21:18,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-28 20:21:19,010 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=131306.66666666666, ans=0.0 2023-09-28 20:21:20,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:21:22,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:23,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 20:21:23,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:26,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:26,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:29,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 20:21:30,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=131373.33333333334, ans=0.1 2023-09-28 20:21:34,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 20:21:36,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:36,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 20:21:39,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:43,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:45,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:21:50,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 20:21:53,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=131440.0, ans=0.125 2023-09-28 20:21:54,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:57,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:59,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:22:03,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:22:06,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:22:06,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:22:10,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 20:22:11,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:22:13,152 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.509e+02 2.927e+02 3.521e+02 5.743e+02, threshold=5.855e+02, percent-clipped=1.0 2023-09-28 20:22:13,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:22:17,705 INFO [train.py:1039] (0/4) Epoch 4, batch 3800, loss[loss=0.2586, simple_loss=0.3186, pruned_loss=0.09927, over 23951.00 frames. ], tot_loss[loss=0.2663, simple_loss=0.3201, pruned_loss=0.1063, over 4719186.94 frames. ], batch size: 86, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:22:23,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:22:24,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=131573.33333333334, ans=0.0 2023-09-28 20:22:26,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:27,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:22:28,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 20:22:30,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:31,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 20:22:33,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:22:35,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:22:35,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:38,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:22:39,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:39,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:22:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:42,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 20:22:45,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 20:22:45,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:22:48,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:51,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 20:22:52,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:22:54,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:22:54,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:57,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:58,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:23:02,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:23:02,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 20:23:05,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:12,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:12,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=131773.33333333334, ans=0.1 2023-09-28 20:23:17,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:23:19,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. 
Duration: 0.75 2023-09-28 20:23:22,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 20:23:24,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:24,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:25,029 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.23 vs. limit=6.0 2023-09-28 20:23:25,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:27,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 20:23:30,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 20:23:30,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 20:23:31,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:32,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=131840.0, ans=0.125 2023-09-28 20:23:33,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:39,630 INFO [train.py:1039] (0/4) Epoch 4, batch 3850, loss[loss=0.2227, simple_loss=0.2824, pruned_loss=0.08148, over 24333.00 frames. 
], tot_loss[loss=0.2648, simple_loss=0.3187, pruned_loss=0.1055, over 4718034.06 frames. ], batch size: 56, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:23:39,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:23:39,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:23:45,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:23:45,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 20:23:47,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:23:47,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:52,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:23:55,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:58,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:24:00,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 20:24:05,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:24:06,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:24:08,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:09,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:24:12,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:14,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:24:15,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:15,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:24:16,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:18,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:19,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:19,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:24:22,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. 
Duration: 0.77 2023-09-28 20:24:22,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 20:24:22,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:22,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:25,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:25,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:25,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 20:24:28,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 20:24:31,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:33,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 20:24:36,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:24:40,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=132106.66666666666, ans=0.125 2023-09-28 20:24:42,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:24:43,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:48,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:48,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 20:24:50,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=132173.33333333334, ans=0.125 2023-09-28 20:24:52,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 20:24:54,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:56,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:57,999 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.471e+02 2.833e+02 3.573e+02 5.682e+02, threshold=5.667e+02, percent-clipped=0.0 2023-09-28 20:24:59,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:24:59,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:25:01,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:25:01,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:25:01,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 20:25:02,763 INFO [train.py:1039] (0/4) Epoch 4, batch 3900, loss[loss=0.2913, simple_loss=0.3535, pruned_loss=0.1146, over 24567.00 frames. ], tot_loss[loss=0.2632, simple_loss=0.3177, pruned_loss=0.1043, over 4720194.97 frames. ], batch size: 71, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:25:02,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:25:04,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 20:25:04,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:04,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:05,352 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=15.0 2023-09-28 20:25:06,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:25:07,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:25:09,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:25:09,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:09,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:25:09,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:09,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 20:25:09,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:10,167 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.29 vs. limit=15.0 2023-09-28 20:25:13,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:15,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:15,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:25:16,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:25:19,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:20,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:21,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=132306.66666666666, ans=0.5 2023-09-28 20:25:23,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:25:25,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 20:25:25,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:27,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 20:25:28,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:30,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 20:25:30,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 20:25:34,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=132373.33333333334, ans=0.0 2023-09-28 20:25:37,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 20:25:37,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:37,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:25:39,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:25:42,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:44,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:25:46,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:25:46,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:25:48,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:25:54,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:54,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:25:57,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=132440.0, ans=0.125 2023-09-28 20:26:02,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 20:26:05,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:26:15,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:18,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:18,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 20:26:18,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 20:26:18,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:19,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 20:26:21,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:26:21,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 20:26:24,454 INFO [train.py:1039] (0/4) Epoch 4, batch 3950, loss[loss=0.2937, simple_loss=0.3296, pruned_loss=0.1289, over 22794.00 frames. ], tot_loss[loss=0.2624, simple_loss=0.3172, pruned_loss=0.1038, over 4728670.52 frames. ], batch size: 322, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:26:29,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:26:30,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 20:26:32,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:26:34,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:26:37,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:26:42,380 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 20:26:43,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:43,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 20:26:44,039 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 20:26:45,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:47,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:26:48,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:26:51,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 20:26:54,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:26:56,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:56,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:26:56,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:26:56,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:27:10,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:27:10,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:27:12,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=132706.66666666666, ans=0.1 2023-09-28 20:27:14,912 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.84 vs. limit=15.0 2023-09-28 20:27:15,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-09-28 20:27:17,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=132773.33333333334, ans=0.0 2023-09-28 20:27:21,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 20:27:21,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 20:27:22,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:27:24,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:27:31,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:27:31,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:27:32,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:27:32,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:27:32,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=132840.0, ans=0.1 2023-09-28 20:27:34,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 20:27:36,621 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.47 vs. 
limit=22.5 2023-09-28 20:27:37,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:27:39,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:27:42,848 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.456e+02 2.836e+02 3.414e+02 5.372e+02, threshold=5.673e+02, percent-clipped=0.0 2023-09-28 20:27:42,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 20:27:43,210 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=132840.0, ans=0.0 2023-09-28 20:27:45,958 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.47 vs. limit=22.5 2023-09-28 20:27:48,134 INFO [train.py:1039] (0/4) Epoch 4, batch 4000, loss[loss=0.2594, simple_loss=0.3138, pruned_loss=0.1024, over 23413.00 frames. ], tot_loss[loss=0.2611, simple_loss=0.3165, pruned_loss=0.1028, over 4728282.25 frames. ], batch size: 134, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:27:49,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=132906.66666666666, ans=0.125 2023-09-28 20:27:50,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=132906.66666666666, ans=0.1 2023-09-28 20:27:56,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:04,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:28:08,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:10,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:28:10,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:10,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 20:28:11,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:28:11,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 20:28:12,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=132973.33333333334, ans=0.125 2023-09-28 20:28:13,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:28:13,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 20:28:13,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=132973.33333333334, ans=0.2 2023-09-28 20:28:14,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:18,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 20:28:18,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:28:18,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:28:18,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:18,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:28:21,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:28:24,124 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 20:28:24,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:28:24,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:28,893 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 20:28:29,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:28:29,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:37,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-09-28 20:28:38,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:40,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:28:41,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 20:28:43,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:28:43,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 20:28:43,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:28:44,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:44,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:28:46,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:28:46,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:28:47,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:50,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. 
Duration: 0.84 2023-09-28 20:28:50,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:53,130 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 20:28:57,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:29:02,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:29:05,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:29:05,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:05,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:29:07,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:09,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.78 vs. limit=15.0 2023-09-28 20:29:10,080 INFO [train.py:1039] (0/4) Epoch 4, batch 4050, loss[loss=0.2758, simple_loss=0.3358, pruned_loss=0.1079, over 24071.00 frames. ], tot_loss[loss=0.2627, simple_loss=0.318, pruned_loss=0.1038, over 4711937.95 frames. ], batch size: 86, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:29:13,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:29:13,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=133240.0, ans=0.2 2023-09-28 20:29:16,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:29:16,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 20:29:17,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:29:19,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:29:19,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:29:21,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:21,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:24,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=133306.66666666666, ans=0.2 2023-09-28 20:29:25,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:30,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:29:30,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-28 20:29:32,080 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-20000.pt 2023-09-28 20:29:35,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:29:35,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:29:38,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:42,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:44,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 20:29:47,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 20:29:47,185 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 20:29:48,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:29:50,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=133373.33333333334, ans=0.1 2023-09-28 20:29:53,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=133373.33333333334, ans=0.125 2023-09-28 20:29:54,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. 
Duration: 0.64 2023-09-28 20:29:55,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:29:59,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:03,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=133440.0, ans=0.125 2023-09-28 20:30:04,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:30:04,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:30:04,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:08,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:30:13,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 20:30:13,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:30:14,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:16,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 20:30:21,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:30:26,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=133506.66666666666, ans=0.125 2023-09-28 20:30:28,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 20:30:29,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:30:29,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:30:30,994 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.888e+02 2.307e+02 2.673e+02 3.242e+02 5.499e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:30:31,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 20:30:32,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 20:30:32,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:35,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:30:36,465 INFO [train.py:1039] (0/4) Epoch 4, batch 4100, loss[loss=0.2727, simple_loss=0.3179, pruned_loss=0.1138, over 23273.00 frames. ], tot_loss[loss=0.2633, simple_loss=0.3183, pruned_loss=0.1041, over 4714616.18 frames. ], batch size: 105, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:30:36,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:30:36,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:30:43,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=133573.33333333334, ans=0.0 2023-09-28 20:30:44,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 20:30:46,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 20:30:47,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 20:30:48,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 20:30:49,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:49,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:30:51,766 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 20:30:54,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:30:56,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:30:56,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:56,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:31:02,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:31:03,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:31:03,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:31:05,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 20:31:05,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:05,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:31:05,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:05,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:31:06,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-09-28 20:31:10,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=133706.66666666666, ans=0.0 2023-09-28 20:31:11,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:13,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 20:31:14,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:31:17,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:17,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 20:31:18,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:31:18,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:31:20,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:31:21,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 20:31:23,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:31:24,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 20:31:25,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 20:31:26,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=133773.33333333334, ans=0.95 2023-09-28 20:31:27,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:27,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:30,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:30,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=133773.33333333334, ans=0.125 2023-09-28 20:31:34,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:31:35,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-09-28 20:31:35,669 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.01 vs. limit=15.0 2023-09-28 20:31:39,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:39,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:31:45,516 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.33 vs. limit=15.0 2023-09-28 20:31:50,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:31:50,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:53,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:55,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:31:58,068 INFO [train.py:1039] (0/4) Epoch 4, batch 4150, loss[loss=0.24, simple_loss=0.3113, pruned_loss=0.0843, over 24459.00 frames. ], tot_loss[loss=0.2631, simple_loss=0.3183, pruned_loss=0.104, over 4714964.12 frames. ], batch size: 69, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:31:59,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:32:00,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:32:01,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:32:01,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:06,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. 
Duration: 0.86 2023-09-28 20:32:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:07,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 20:32:09,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 20:32:09,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 20:32:11,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:15,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:32:15,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:17,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=133973.33333333334, ans=0.2 2023-09-28 20:32:21,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:22,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:32:22,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:32:26,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-28 20:32:26,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:28,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:32:31,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:34,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:35,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 20:32:39,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 20:32:39,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:32:39,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 20:32:39,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:32:39,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:32:42,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:32:43,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:32:48,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 20:32:51,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:32:52,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=134106.66666666666, ans=0.2 2023-09-28 20:32:54,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:32:56,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 20:32:56,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:58,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 20:32:59,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:33:01,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:33:02,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:02,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 20:33:02,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:33:02,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:33:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:33:06,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 20:33:06,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:06,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:33:06,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:33:07,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 20:33:08,343 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.12 vs. limit=15.0 2023-09-28 20:33:09,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:33:09,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:33:11,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:33:12,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 20:33:13,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 20:33:13,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:33:16,747 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.811e+02 2.342e+02 2.661e+02 3.100e+02 4.687e+02, threshold=5.322e+02, percent-clipped=0.0 2023-09-28 20:33:19,931 INFO [train.py:1039] (0/4) Epoch 4, batch 4200, loss[loss=0.2617, simple_loss=0.2867, pruned_loss=0.1183, over 23502.00 frames. ], tot_loss[loss=0.2619, simple_loss=0.317, pruned_loss=0.1034, over 4712765.48 frames. ], batch size: 285, lr: 2.32e-02, grad_scale: 16.0 2023-09-28 20:33:19,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:33:21,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 20:33:23,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:33:24,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:26,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:33:27,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:27,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:33:30,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 20:33:35,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 20:33:35,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:38,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:40,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:33:42,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=134306.66666666666, ans=0.125 2023-09-28 20:33:42,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=134306.66666666666, ans=0.125 2023-09-28 20:33:43,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:33:45,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:33:45,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:47,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 20:33:47,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 20:33:48,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:48,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:33:50,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:33:53,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 20:33:53,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:57,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:33:59,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:34:03,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:34:03,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:05,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:34:05,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-09-28 20:34:05,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:07,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:34:08,051 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.52 vs. limit=22.5 2023-09-28 20:34:09,205 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=134440.0, ans=10.0 2023-09-28 20:34:13,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:34:14,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:21,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:34:24,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 20:34:24,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=134506.66666666666, ans=0.125 2023-09-28 20:34:24,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=134506.66666666666, ans=0.95 2023-09-28 20:34:27,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:34:31,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:34:32,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:32,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=134506.66666666666, ans=0.1 2023-09-28 20:34:34,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 20:34:42,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:34:42,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=134573.33333333334, ans=0.125 2023-09-28 20:34:43,562 INFO [train.py:1039] (0/4) Epoch 4, batch 4250, loss[loss=0.2614, simple_loss=0.3107, pruned_loss=0.106, over 23880.00 frames. ], tot_loss[loss=0.2609, simple_loss=0.3152, pruned_loss=0.1032, over 4717052.22 frames. ], batch size: 195, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:34:45,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:45,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:34:48,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:34:48,441 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=134573.33333333334, ans=0.125 2023-09-28 20:34:53,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:34:53,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 20:34:53,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:56,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:59,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:03,360 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-09-28 20:35:04,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:06,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:08,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:35:08,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:35:10,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:12,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:13,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:15,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:35:18,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:19,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 20:35:22,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 20:35:22,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:23,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=134706.66666666666, ans=0.1 2023-09-28 20:35:24,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:24,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:35:26,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:35:26,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:26,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:27,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=134706.66666666666, ans=0.0 2023-09-28 20:35:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:35:32,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:35:35,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:35:37,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:37,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=134773.33333333334, ans=0.125 2023-09-28 20:35:38,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 20:35:38,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 20:35:39,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=134773.33333333334, ans=0.125 2023-09-28 20:35:40,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 20:35:42,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:35:44,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:35:44,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=134773.33333333334, ans=0.95 2023-09-28 20:35:46,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:46,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:48,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 20:35:49,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:35:51,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:35:54,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:35:55,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:57,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=134840.0, ans=0.125 2023-09-28 20:35:58,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:36:00,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:02,334 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.342e+02 2.586e+02 3.220e+02 5.035e+02, threshold=5.173e+02, percent-clipped=0.0 2023-09-28 20:36:02,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:02,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:36:03,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:04,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 20:36:05,376 INFO [train.py:1039] (0/4) Epoch 4, batch 4300, loss[loss=0.2734, simple_loss=0.3418, pruned_loss=0.1024, over 24538.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.3152, pruned_loss=0.1029, over 4718881.38 frames. ], batch size: 71, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:36:05,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:36:11,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:11,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:14,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:23,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:36:23,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 20:36:26,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:36:26,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=134973.33333333334, ans=0.1 2023-09-28 20:36:27,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:36:27,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:36:27,874 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 20:36:31,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:36:32,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 20:36:34,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=134973.33333333334, ans=0.125 2023-09-28 20:36:36,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=135040.0, ans=0.125 2023-09-28 20:36:37,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 20:36:37,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:36:39,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 20:36:40,557 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=15.0 2023-09-28 20:36:41,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:36:41,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=135040.0, ans=0.1 2023-09-28 20:36:42,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:36:44,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:36:44,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:45,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 20:36:47,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:49,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:49,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 20:36:49,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 20:36:53,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:56,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:56,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:36:56,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:57,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:57,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 20:36:57,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 20:36:58,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-09-28 20:36:59,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:36:59,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 20:36:59,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 20:37:05,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:07,164 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 20:37:07,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:37:08,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:08,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:11,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 20:37:11,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:37:11,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:12,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:37:12,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:14,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:37:16,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:37:18,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:20,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:20,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:28,121 INFO [train.py:1039] (0/4) Epoch 4, batch 4350, loss[loss=0.2698, simple_loss=0.3185, pruned_loss=0.1106, over 23751.00 frames. ], tot_loss[loss=0.2606, simple_loss=0.3156, pruned_loss=0.1028, over 4717458.48 frames. ], batch size: 149, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:37:28,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 20:37:28,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:37:34,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:37:37,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:37:38,264 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.15 vs. limit=22.5 2023-09-28 20:37:39,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:37:39,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:37:42,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=135306.66666666666, ans=0.125 2023-09-28 20:37:45,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:37:47,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:50,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:37:50,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:53,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:37:55,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:37:58,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 20:38:04,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 20:38:05,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:06,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:12,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:13,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 20:38:16,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:18,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:38:24,855 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 20:38:26,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:26,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:38:27,953 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 20:38:29,406 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-09-28 20:38:29,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:29,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:29,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:38:29,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:31,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:31,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:38:35,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 20:38:35,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:35,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:35,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:37,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 20:38:39,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-09-28 20:38:39,169 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 20:38:39,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 20:38:42,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:38:42,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:38:43,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:38:43,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:38:46,861 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.300e+02 2.585e+02 3.110e+02 4.848e+02, threshold=5.170e+02, percent-clipped=0.0 2023-09-28 20:38:47,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 20:38:47,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=135506.66666666666, ans=0.0 2023-09-28 20:38:48,656 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 20:38:48,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:50,006 INFO [train.py:1039] (0/4) Epoch 4, batch 4400, loss[loss=0.27, simple_loss=0.3363, pruned_loss=0.1019, over 23997.00 frames. ], tot_loss[loss=0.262, simple_loss=0.317, pruned_loss=0.1035, over 4725793.77 frames. 
], batch size: 80, lr: 2.31e-02, grad_scale: 32.0 2023-09-28 20:38:53,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:38:53,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:55,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=135573.33333333334, ans=0.0 2023-09-28 20:38:56,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:58,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=135573.33333333334, ans=0.125 2023-09-28 20:38:59,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 20:38:59,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 20:38:59,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 20:38:59,986 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 20:39:01,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:39:01,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:39:03,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. 
Duration: 0.87 2023-09-28 20:39:04,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:06,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:06,862 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 20:39:11,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:11,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 20:39:12,029 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 20:39:15,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 20:39:15,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 20:39:15,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 20:39:15,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:17,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:17,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:39:19,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:19,738 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.87 vs. limit=15.0 2023-09-28 20:39:20,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 20:39:20,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 20:39:22,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:25,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:39:25,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:26,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:28,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:28,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 20:39:29,628 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 20:39:33,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:39:39,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:41,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 20:39:45,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:39:50,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:39:50,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=135773.33333333334, ans=15.0 2023-09-28 20:39:51,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:39:51,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 20:39:51,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:39:51,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:39:51,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:39:53,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:39:58,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-09-28 20:40:01,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 20:40:02,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 20:40:02,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:02,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 20:40:04,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:40:05,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:40:08,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 20:40:11,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:40:13,240 INFO [train.py:1039] (0/4) Epoch 4, batch 4450, loss[loss=0.2865, simple_loss=0.3265, pruned_loss=0.1233, over 23643.00 frames. ], tot_loss[loss=0.2613, simple_loss=0.3173, pruned_loss=0.1027, over 4735662.16 frames. ], batch size: 232, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:40:16,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:16,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 20:40:26,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:40:26,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:40:31,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:31,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:40:33,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:40:33,357 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:40:35,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:36,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 20:40:36,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:38,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:38,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:40:38,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 20:40:39,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:40:43,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=135973.33333333334, ans=0.125 2023-09-28 20:40:43,914 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.89 vs. limit=15.0 2023-09-28 20:40:44,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:47,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:49,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:40:55,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:40:57,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 20:40:57,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. 
Duration: 0.86 2023-09-28 20:40:57,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:41:01,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:03,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 20:41:06,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:41:09,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:09,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 20:41:09,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:09,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:09,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:41:09,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:12,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:15,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-09-28 20:41:17,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 20:41:19,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:41:20,145 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=12.0 2023-09-28 20:41:20,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:41:21,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=136173.33333333334, ans=0.2 2023-09-28 20:41:22,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:24,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:24,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=136173.33333333334, ans=0.0 2023-09-28 20:41:25,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:41:26,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:41:31,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. 
Duration: 0.85 2023-09-28 20:41:32,908 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.347e+02 2.673e+02 3.318e+02 4.703e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:41:33,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:41:35,965 INFO [train.py:1039] (0/4) Epoch 4, batch 4500, loss[loss=0.2648, simple_loss=0.3319, pruned_loss=0.09891, over 24631.00 frames. ], tot_loss[loss=0.2623, simple_loss=0.3178, pruned_loss=0.1034, over 4704144.66 frames. ], batch size: 68, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:41:39,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:40,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 20:41:40,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 20:41:42,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:41:45,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:47,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:47,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:41:48,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:41:48,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:48,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:50,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=136306.66666666666, ans=0.125 2023-09-28 20:41:54,444 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=136306.66666666666, ans=0.0 2023-09-28 20:42:02,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:42:04,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:42:07,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:08,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:42:08,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:42:15,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-28 20:42:16,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=136373.33333333334, ans=0.2 2023-09-28 20:42:20,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:42:24,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:42:27,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:42:27,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 20:42:29,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:29,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:34,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:42:34,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 20:42:34,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-28 20:42:34,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:39,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:42:40,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:42:43,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:44,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:42:46,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:42:47,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 20:42:50,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 20:42:50,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 20:42:55,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 20:42:56,790 INFO [train.py:1039] (0/4) Epoch 4, batch 4550, loss[loss=0.2396, simple_loss=0.3082, pruned_loss=0.08555, over 24486.00 frames. ], tot_loss[loss=0.2614, simple_loss=0.3165, pruned_loss=0.1031, over 4706163.32 frames. 
], batch size: 66, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:42:56,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 20:42:59,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:03,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:04,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:05,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=136573.33333333334, ans=0.2 2023-09-28 20:43:09,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:12,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:43:14,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:43:16,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:16,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:43:16,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:43:20,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:20,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:23,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:43:23,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=136640.0, ans=0.125 2023-09-28 20:43:26,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 20:43:28,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 20:43:29,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:43:29,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 20:43:32,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 20:43:34,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:37,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 20:43:39,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-28 20:43:44,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:43:46,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 20:43:49,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:43:51,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:51,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:52,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:53,599 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.78 vs. limit=6.0 2023-09-28 20:43:54,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 20:43:55,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 20:43:56,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 20:43:57,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 20:43:59,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=136773.33333333334, ans=0.1 2023-09-28 20:43:59,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=136773.33333333334, ans=0.0 2023-09-28 20:44:00,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 20:44:00,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:44:00,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:02,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:02,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:02,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:44:05,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:44:05,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-09-28 20:44:06,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:44:06,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:44:08,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 20:44:08,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:44:08,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 20:44:10,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=136840.0, ans=0.0 2023-09-28 20:44:11,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:44:11,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:44:12,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=136840.0, ans=0.125 2023-09-28 20:44:15,289 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.285e+02 2.509e+02 2.934e+02 4.311e+02, threshold=5.019e+02, percent-clipped=0.0 2023-09-28 20:44:15,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:44:15,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:44:17,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:44:19,151 INFO [train.py:1039] (0/4) Epoch 4, batch 4600, loss[loss=0.2377, simple_loss=0.2879, pruned_loss=0.09369, over 23443.00 frames. ], tot_loss[loss=0.259, simple_loss=0.3141, pruned_loss=0.102, over 4701110.70 frames. ], batch size: 134, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:44:19,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:44:20,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:44:22,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:23,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:26,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:44:26,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:44:27,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:28,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 20:44:30,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 20:44:34,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:44:36,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:37,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:39,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=136973.33333333334, ans=0.1 2023-09-28 20:44:45,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 20:44:47,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:50,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:54,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:44:54,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:58,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 20:44:58,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:44:59,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 20:44:59,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=137040.0, ans=0.125 2023-09-28 20:45:03,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:03,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:45:05,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:45:09,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 20:45:11,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:45:15,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:15,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=137106.66666666666, ans=0.125 2023-09-28 20:45:16,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:45:21,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:21,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. 
Duration: 0.4555625 2023-09-28 20:45:21,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:22,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 20:45:22,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:23,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:25,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:25,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:45:27,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:28,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 20:45:28,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 20:45:28,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 20:45:28,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:31,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:45:31,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:33,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:39,405 INFO [train.py:1039] (0/4) Epoch 4, batch 4650, loss[loss=0.2767, simple_loss=0.332, pruned_loss=0.1107, over 23426.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.3136, pruned_loss=0.1018, over 4703237.04 frames. ], batch size: 93, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:45:42,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:45:42,712 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=137240.0, ans=0.125 2023-09-28 20:45:45,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:45,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:45,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:45:45,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:46,916 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=137240.0, ans=0.125 2023-09-28 20:45:47,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:45:49,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:49,848 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=137240.0, ans=0.125 2023-09-28 20:45:52,168 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.82 vs. limit=6.0 2023-09-28 20:45:53,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 20:45:57,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:45:59,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 20:45:59,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:59,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 20:46:00,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:46:02,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 20:46:02,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 20:46:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:46:02,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:46:08,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:46:08,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:09,921 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 20:46:13,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:13,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 20:46:16,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:16,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:46:17,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 20:46:19,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=137373.33333333334, ans=0.0 2023-09-28 20:46:20,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:46:22,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=137373.33333333334, ans=0.1 2023-09-28 20:46:23,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:46:28,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:32,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:34,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=137440.0, ans=0.0 2023-09-28 20:46:35,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:35,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:37,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:46:38,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 20:46:40,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 20:46:40,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 20:46:40,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. 
Duration: 0.66 2023-09-28 20:46:40,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=137440.0, ans=0.0 2023-09-28 20:46:42,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:46:45,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=137506.66666666666, ans=0.125 2023-09-28 20:46:49,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:46:49,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:46:49,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 20:46:50,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:51,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:51,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:46:53,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:46:57,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:46:57,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 20:46:57,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:58,907 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.187e+02 2.531e+02 2.951e+02 4.992e+02, threshold=5.061e+02, percent-clipped=0.0 2023-09-28 20:47:00,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:02,487 INFO [train.py:1039] (0/4) Epoch 4, batch 4700, loss[loss=0.2744, simple_loss=0.3175, pruned_loss=0.1157, over 23879.00 frames. ], tot_loss[loss=0.2603, simple_loss=0.3152, pruned_loss=0.1027, over 4702853.86 frames. ], batch size: 195, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:47:02,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:47:02,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:47:02,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 20:47:04,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:47:06,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 20:47:14,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:14,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:47:16,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:47:17,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:19,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:47:24,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=137640.0, ans=0.0 2023-09-28 20:47:25,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 20:47:25,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 20:47:27,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:28,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:47:28,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:47:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:39,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 20:47:40,682 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.46 vs. limit=15.0 2023-09-28 20:47:42,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:47:45,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:51,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 20:47:52,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:47:54,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:47:57,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 20:47:58,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:47:59,228 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=137773.33333333334, ans=0.125 2023-09-28 20:48:03,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:48:05,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. 
Duration: 0.76 2023-09-28 20:48:07,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:07,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:10,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:48:11,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.34 vs. limit=12.0 2023-09-28 20:48:12,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:48:12,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 20:48:12,258 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 20:48:15,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:17,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 20:48:19,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 20:48:22,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 20:48:25,430 INFO [train.py:1039] (0/4) Epoch 4, batch 4750, loss[loss=0.2629, simple_loss=0.3079, pruned_loss=0.109, over 22729.00 frames. ], tot_loss[loss=0.2615, simple_loss=0.3166, pruned_loss=0.1032, over 4707038.47 frames. ], batch size: 322, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:48:25,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:48:27,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:48:33,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 20:48:33,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:48:36,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 20:48:38,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:48:38,802 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=14.08 vs. 
limit=15.0 2023-09-28 20:48:40,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:40,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:43,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=137973.33333333334, ans=0.025 2023-09-28 20:48:44,085 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.47 vs. limit=15.0 2023-09-28 20:48:46,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 20:48:46,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=137973.33333333334, ans=0.125 2023-09-28 20:48:51,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:48:54,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 20:48:54,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:59,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:48:59,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:59,451 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 20:48:59,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 20:49:05,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 20:49:08,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:10,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:11,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:49:11,781 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 20:49:11,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:12,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=138106.66666666666, ans=0.1 2023-09-28 20:49:12,673 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.54 vs. limit=12.0 2023-09-28 20:49:15,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 20:49:18,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:49:18,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 20:49:20,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 20:49:20,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:49:21,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:49:21,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:22,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:49:24,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 20:49:27,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 20:49:29,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:49:32,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:49:32,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-09-28 20:49:33,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:49:33,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:35,737 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=138173.33333333334, ans=0.025 2023-09-28 20:49:36,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:49:37,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:38,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:49:43,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:43,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 20:49:44,415 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.351e+02 2.944e+02 3.482e+02 5.215e+02, threshold=5.888e+02, percent-clipped=1.0 2023-09-28 20:49:44,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 20:49:46,109 INFO [train.py:1039] (0/4) Epoch 4, batch 4800, loss[loss=0.2949, simple_loss=0.3368, pruned_loss=0.1265, over 23769.00 frames. ], tot_loss[loss=0.2626, simple_loss=0.3179, pruned_loss=0.1036, over 4711427.67 frames. 
], batch size: 232, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:49:46,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 20:49:47,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:49:49,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:51,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 20:49:56,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:58,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:03,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=138306.66666666666, ans=0.1 2023-09-28 20:50:05,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:50:05,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:05,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:06,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. 
Duration: 0.86 2023-09-28 20:50:08,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:50:08,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:50:09,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:50:13,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:14,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:16,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:50:19,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:19,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:50:19,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:19,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:22,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:25,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:50:25,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:50:27,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:50:29,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 20:50:33,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 20:50:34,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:34,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:50:35,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:50:35,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:35,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:50:36,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 20:50:37,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:41,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:44,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:50:50,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 20:50:50,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:52,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:52,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:50:53,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:58,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:58,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:50:58,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:50:58,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:50:59,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=138506.66666666666, ans=0.2 2023-09-28 20:51:00,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:51:00,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:51:04,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:04,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:04,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:51:06,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 20:51:08,276 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.96 vs. limit=15.0 2023-09-28 20:51:09,197 INFO [train.py:1039] (0/4) Epoch 4, batch 4850, loss[loss=0.3395, simple_loss=0.3631, pruned_loss=0.158, over 19366.00 frames. ], tot_loss[loss=0.2633, simple_loss=0.3186, pruned_loss=0.104, over 4710745.50 frames. ], batch size: 388, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:51:09,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. 
Duration: 0.69 2023-09-28 20:51:09,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:09,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:10,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:12,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=138573.33333333334, ans=0.125 2023-09-28 20:51:14,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:51:21,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 20:51:23,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:23,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=138640.0, ans=0.125 2023-09-28 20:51:26,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:28,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:51:28,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:51:31,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:33,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=138640.0, ans=0.125 2023-09-28 20:51:34,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:51:36,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:51:36,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 20:51:39,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=138640.0, ans=0.125 2023-09-28 20:51:41,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:43,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:51:43,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:51:45,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:51:45,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 20:51:48,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 20:51:48,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 20:51:54,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 20:51:54,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:52:02,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:52:02,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 20:52:02,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:52:03,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=138773.33333333334, ans=0.125 2023-09-28 20:52:04,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:52:06,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:52:08,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. 
Duration: 0.99 2023-09-28 20:52:08,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:09,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 20:52:09,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:09,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:11,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 20:52:15,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=138840.0, ans=0.2 2023-09-28 20:52:20,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:25,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=138840.0, ans=0.025 2023-09-28 20:52:27,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:52:27,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:52:29,910 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.282e+02 2.559e+02 2.982e+02 4.179e+02, threshold=5.119e+02, percent-clipped=0.0 2023-09-28 20:52:31,428 INFO [train.py:1039] (0/4) Epoch 4, batch 4900, loss[loss=0.24, simple_loss=0.3164, pruned_loss=0.08174, over 24333.00 frames. ], tot_loss[loss=0.2622, simple_loss=0.3173, pruned_loss=0.1035, over 4721233.58 frames. ], batch size: 74, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:52:33,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 20:52:33,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:52:36,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=138906.66666666666, ans=0.1 2023-09-28 20:52:38,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=138906.66666666666, ans=0.125 2023-09-28 20:52:41,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:42,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:42,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:52:45,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 20:52:50,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-09-28 20:52:54,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 20:52:56,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 20:52:56,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:52:56,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:56,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:52:58,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:58,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:52:58,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 20:53:01,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 20:53:02,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:53:04,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:53:06,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 20:53:07,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:53:08,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:09,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:09,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 20:53:11,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:53:12,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:53:13,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 20:53:13,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 20:53:13,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=139040.0, ans=0.2 2023-09-28 20:53:17,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 20:53:19,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 20:53:22,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=139106.66666666666, ans=0.125 2023-09-28 20:53:23,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:53:23,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:53:23,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:23,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:53:25,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:53:25,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 20:53:28,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:29,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:53:30,486 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.53 vs. limit=22.5 2023-09-28 20:53:31,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:53:35,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 20:53:35,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:53:35,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:53:36,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 20:53:38,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=139173.33333333334, ans=0.125 2023-09-28 20:53:42,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:44,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:53:46,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 20:53:46,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:53:47,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:53:49,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 20:53:51,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=139173.33333333334, ans=0.125 2023-09-28 20:53:52,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:53:52,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:53:52,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:54,526 INFO [train.py:1039] (0/4) Epoch 4, batch 4950, loss[loss=0.2597, simple_loss=0.3199, pruned_loss=0.09979, over 24464.00 frames. ], tot_loss[loss=0.2599, simple_loss=0.315, pruned_loss=0.1024, over 4731131.17 frames. ], batch size: 63, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:53:54,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:53:56,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:53:59,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:53:59,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:53:59,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=139240.0, ans=0.125 2023-09-28 20:54:02,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-28 20:54:02,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 20:54:02,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:54:06,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 20:54:06,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:06,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:54:07,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:54:07,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:09,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:11,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:54:12,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:54:12,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=139306.66666666666, ans=0.0 2023-09-28 20:54:14,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:54:15,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:15,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:54:18,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:54:25,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:26,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:54:28,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:28,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:32,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:54:33,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 20:54:35,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 20:54:35,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 20:54:35,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=139373.33333333334, ans=0.125 2023-09-28 20:54:37,747 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-09-28 20:54:39,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:54:39,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:54:42,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:54:42,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:54:43,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:54:45,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:46,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:54:48,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:54:50,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:54:51,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:51,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 20:54:53,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:54:53,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:54:55,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=139440.0, ans=0.125 2023-09-28 20:54:57,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:54:58,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:54:58,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:55:00,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:00,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:55:00,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:55:03,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:55:04,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:55:05,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:55:05,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 20:55:05,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=139506.66666666666, ans=10.0 2023-09-28 20:55:10,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:15,179 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.343e+02 2.664e+02 3.186e+02 5.232e+02, threshold=5.328e+02, percent-clipped=1.0 2023-09-28 20:55:16,769 INFO [train.py:1039] (0/4) Epoch 4, batch 5000, loss[loss=0.2353, simple_loss=0.2907, pruned_loss=0.08995, over 20549.00 frames. ], tot_loss[loss=0.2596, simple_loss=0.3145, pruned_loss=0.1023, over 4720527.63 frames. ], batch size: 45, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:55:16,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 20:55:16,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:55:21,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:21,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 20:55:22,787 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.86 vs. limit=15.0 2023-09-28 20:55:23,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 20:55:24,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 20:55:26,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:55:28,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 20:55:28,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:55:28,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:55:30,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 20:55:30,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:31,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:55:33,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 20:55:33,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:55:33,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:55:35,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 20:55:36,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 20:55:36,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:55:36,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 20:55:36,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:55:39,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:39,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:55:39,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 20:55:39,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 20:55:40,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 20:55:40,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 20:55:42,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:42,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 20:55:43,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:45,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:45,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:48,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:55:49,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 20:55:51,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:55:54,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:55:57,615 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 20:55:59,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:56:02,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:56:02,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:04,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 20:56:05,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:56:06,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:06,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:07,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:56:08,567 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.65 vs. limit=15.0 2023-09-28 20:56:09,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:13,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:14,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:20,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. 
Duration: 0.89 2023-09-28 20:56:24,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:25,560 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.82 vs. limit=15.0 2023-09-28 20:56:32,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:34,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:34,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:56:34,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:56:36,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:56:37,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:39,001 INFO [train.py:1039] (0/4) Epoch 4, batch 5050, loss[loss=0.2646, simple_loss=0.3301, pruned_loss=0.09958, over 24646.00 frames. ], tot_loss[loss=0.2606, simple_loss=0.3155, pruned_loss=0.1028, over 4706401.56 frames. ], batch size: 73, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:56:42,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 20:56:42,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 20:56:44,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:56:47,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:48,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=139906.66666666666, ans=0.0 2023-09-28 20:56:49,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:56:51,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 20:56:53,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:53,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:54,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:56:56,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:56:57,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:57:05,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. 
Duration: 0.91 2023-09-28 20:57:05,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:57:07,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:07,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 20:57:07,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:10,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:10,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:57:10,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:57:10,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 20:57:12,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 20:57:14,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:19,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:22,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 20:57:22,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 20:57:24,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:27,266 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=140040.0, ans=0.2 2023-09-28 20:57:28,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 20:57:28,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=140106.66666666666, ans=0.125 2023-09-28 20:57:29,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:57:30,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:57:30,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:30,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:31,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 20:57:31,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=140106.66666666666, ans=0.95 2023-09-28 20:57:34,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:57:34,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:34,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:57:34,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:57:36,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 20:57:36,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:57:39,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:43,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:43,860 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 20:57:43,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. 
Duration: 0.688875 2023-09-28 20:57:44,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=140173.33333333334, ans=0.07 2023-09-28 20:57:45,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:57:45,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:45,598 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 20:57:49,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:49,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 20:57:49,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:54,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 20:57:56,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 20:57:59,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 20:58:00,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:00,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:58:02,021 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.333e+02 2.668e+02 3.236e+02 5.838e+02, threshold=5.336e+02, percent-clipped=1.0 2023-09-28 20:58:03,586 INFO [train.py:1039] (0/4) Epoch 4, batch 5100, loss[loss=0.2736, simple_loss=0.3175, pruned_loss=0.1148, over 23345.00 frames. ], tot_loss[loss=0.2607, simple_loss=0.3159, pruned_loss=0.1027, over 4715990.59 frames. ], batch size: 285, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:58:05,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 20:58:06,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=140240.0, ans=0.125 2023-09-28 20:58:08,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:58:11,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 20:58:11,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 20:58:12,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:14,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 20:58:15,105 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.60 vs. limit=22.5 2023-09-28 20:58:18,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:58:18,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 20:58:20,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 20:58:25,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:58:25,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:58:28,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:34,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 20:58:34,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:36,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:58:36,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:58:37,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:58:40,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:40,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 20:58:41,467 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 20:58:41,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:42,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 20:58:42,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 20:58:46,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:55,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:58:58,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 20:58:58,145 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 20:58:58,168 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 20:59:01,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 20:59:01,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 20:59:06,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 20:59:10,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 20:59:12,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:59:14,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:59:16,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 20:59:17,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:59:19,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 20:59:24,971 INFO [train.py:1039] (0/4) Epoch 4, batch 5150, loss[loss=0.3257, simple_loss=0.3495, pruned_loss=0.1509, over 19418.00 frames. ], tot_loss[loss=0.2622, simple_loss=0.3168, pruned_loss=0.1038, over 4716388.80 frames. ], batch size: 389, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:59:25,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:59:25,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:59:25,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 20:59:26,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:59:26,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:59:28,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:59:28,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 20:59:28,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 20:59:29,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 20:59:29,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:59:29,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 20:59:31,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:32,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:59:34,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:36,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 20:59:36,687 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:59:41,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:59:41,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 20:59:43,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:44,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:59:45,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:59:45,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:59:46,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:59:47,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:59:47,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:59:47,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 20:59:49,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 20:59:50,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:59:50,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=140640.0, ans=0.125 2023-09-28 20:59:52,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:59:53,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 20:59:55,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:00:01,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:00:01,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=140706.66666666666, ans=0.125 2023-09-28 21:00:04,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 21:00:07,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:00:13,407 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=140773.33333333334, ans=0.125 2023-09-28 21:00:14,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:18,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:00:23,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:23,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:25,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=140773.33333333334, ans=0.95 2023-09-28 21:00:27,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 21:00:31,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:00:31,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:00:33,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:00:36,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:37,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:37,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 21:00:42,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:42,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 21:00:45,324 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.271e+02 2.546e+02 2.924e+02 4.595e+02, threshold=5.092e+02, percent-clipped=0.0 2023-09-28 21:00:45,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:45,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:00:47,429 INFO [train.py:1039] (0/4) Epoch 4, batch 5200, loss[loss=0.3548, simple_loss=0.3819, pruned_loss=0.1639, over 19675.00 frames. ], tot_loss[loss=0.2625, simple_loss=0.3176, pruned_loss=0.1037, over 4723977.57 frames. ], batch size: 388, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:00:47,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:00:47,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:00:49,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:00:49,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:00:50,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:00:54,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:00:57,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:00:59,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=140906.66666666666, ans=0.125 2023-09-28 21:01:01,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=140906.66666666666, ans=0.125 2023-09-28 21:01:01,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=140906.66666666666, ans=0.125 2023-09-28 21:01:02,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 21:01:02,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:01:02,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:03,423 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.77 vs. limit=15.0 2023-09-28 21:01:05,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:07,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:01:07,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:08,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. 
Duration: 0.49 2023-09-28 21:01:11,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:01:12,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:12,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=140973.33333333334, ans=0.2 2023-09-28 21:01:15,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 21:01:16,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:01:18,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:01:19,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 21:01:21,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 21:01:23,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 21:01:23,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:23,501 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 21:01:24,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 21:01:25,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:25,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:01:27,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 21:01:28,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:01:30,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:33,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 21:01:33,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 21:01:35,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 21:01:38,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 21:01:38,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:01:43,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:01:43,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:01:45,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 21:01:45,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:46,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:01:46,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:46,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:01:47,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=141106.66666666666, ans=22.5 2023-09-28 21:01:50,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:01:53,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:01:55,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:56,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:01:56,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:02:03,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:04,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 21:02:05,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:02:05,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:02:08,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:09,665 INFO [train.py:1039] (0/4) Epoch 4, batch 5250, loss[loss=0.2643, simple_loss=0.3257, pruned_loss=0.1015, over 23990.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3168, pruned_loss=0.1045, over 4690614.88 frames. ], batch size: 80, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:02:09,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:02:09,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:02:11,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:02:15,653 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.72 vs. limit=22.5 2023-09-28 21:02:16,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:02:17,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:02:17,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=141240.0, ans=0.0 2023-09-28 21:02:18,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:02:23,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:25,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:02:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:02:30,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:02:32,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 21:02:32,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:34,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:02:53,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=141373.33333333334, ans=0.2 2023-09-28 21:03:01,205 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=141440.0, ans=0.0 2023-09-28 21:03:21,508 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.854e+02 2.354e+02 2.746e+02 3.335e+02 6.410e+02, threshold=5.493e+02, percent-clipped=2.0 2023-09-28 21:03:22,925 INFO [train.py:1039] (0/4) Epoch 4, batch 5300, loss[loss=0.2541, simple_loss=0.3217, pruned_loss=0.0932, over 24460.00 frames. ], tot_loss[loss=0.2615, simple_loss=0.3155, pruned_loss=0.1037, over 4694728.57 frames. ], batch size: 66, lr: 2.26e-02, grad_scale: 32.0 2023-09-28 21:03:34,369 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:03:36,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=141640.0, ans=0.0 2023-09-28 21:03:38,320 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-4.pt 2023-09-28 21:03:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:03:43,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 21:03:43,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 21:03:43,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:03:44,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:44,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:44,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:44,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:44,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:03:44,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:44,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:03:45,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:03:45,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 21:03:45,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 21:03:45,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 21:03:45,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 21:03:45,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 21:03:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 21:03:45,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:46,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:46,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:46,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:46,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:03:47,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:47,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:47,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:47,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:47,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 21:03:47,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:03:47,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:47,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:03:48,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 21:03:48,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:49,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:49,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 21:03:49,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 21:03:49,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:03:49,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:03:49,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 21:03:49,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. 
Duration: 0.71 2023-09-28 21:03:49,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:50,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:03:51,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:51,172 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 21:03:51,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 21:03:51,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:03:51,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:51,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 21:03:51,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 21:03:51,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 21:03:52,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:55,444 INFO [train.py:1039] (0/4) Epoch 5, batch 0, loss[loss=0.2703, simple_loss=0.3217, pruned_loss=0.1095, over 23798.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3217, pruned_loss=0.1095, over 23798.00 frames. 
], batch size: 179, lr: 2.11e-02, grad_scale: 32.0 2023-09-28 21:03:55,445 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 21:04:10,254 INFO [train.py:1071] (0/4) Epoch 5, validation: loss=0.3547, simple_loss=0.3281, pruned_loss=0.1907, over 1125622.00 frames. 2023-09-28 21:04:10,255 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 21:04:10,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 21:04:12,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:04:14,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:04:19,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:19,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:04:20,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:21,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 21:04:23,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 21:04:25,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:26,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:04:31,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:32,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:04:32,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:32,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 21:04:33,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=141720.0, ans=0.0 2023-09-28 21:04:35,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:39,548 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.47 vs. limit=10.0 2023-09-28 21:04:46,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:04:46,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:48,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 21:04:52,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 21:04:52,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:04:53,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:04:58,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:04:58,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=141853.33333333334, ans=0.1 2023-09-28 21:05:02,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:07,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 21:05:10,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 21:05:10,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:10,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:11,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:05:11,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:05:13,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. 
Duration: 0.85 2023-09-28 21:05:17,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:19,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:23,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:05:27,428 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 21:05:28,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:05:32,008 INFO [train.py:1039] (0/4) Epoch 5, batch 50, loss[loss=0.2564, simple_loss=0.333, pruned_loss=0.08989, over 24305.00 frames. ], tot_loss[loss=0.2606, simple_loss=0.3183, pruned_loss=0.1015, over 1074459.58 frames. ], batch size: 74, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:05:32,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:32,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=141986.66666666666, ans=0.0 2023-09-28 21:05:35,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:36,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 21:05:36,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 21:05:36,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:05:39,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:42,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:45,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:48,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 21:05:48,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:55,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:05:57,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 21:05:59,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 21:06:02,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:06:02,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:02,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:06:02,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:05,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:06:05,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:06:05,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:12,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:13,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:14,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:06:15,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 21:06:18,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:06:19,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:06:19,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 21:06:20,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:06:22,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 21:06:22,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=142186.66666666666, ans=0.0 2023-09-28 21:06:29,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:06:29,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:30,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=142186.66666666666, ans=0.125 2023-09-28 21:06:31,358 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.197e+02 2.413e+02 2.834e+02 4.473e+02, threshold=4.826e+02, percent-clipped=0.0 2023-09-28 21:06:31,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:33,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:33,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:34,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=142186.66666666666, ans=0.2 2023-09-28 21:06:36,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 21:06:36,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. 
Duration: 0.56 2023-09-28 21:06:37,760 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.26 vs. limit=15.0 2023-09-28 21:06:38,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:41,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:42,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:42,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 21:06:44,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 21:06:44,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 21:06:47,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:47,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:06:47,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 21:06:48,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. 
Duration: 0.74 2023-09-28 21:06:50,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:50,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:52,032 INFO [train.py:1039] (0/4) Epoch 5, batch 100, loss[loss=0.2851, simple_loss=0.3277, pruned_loss=0.1212, over 23720.00 frames. ], tot_loss[loss=0.2611, simple_loss=0.319, pruned_loss=0.1016, over 1887131.97 frames. ], batch size: 232, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:06:52,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:06:53,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:06:53,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=142320.0, ans=0.125 2023-09-28 21:06:55,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:06:57,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:07:01,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:05,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 21:07:05,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:07:12,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:07:12,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:12,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:07:12,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:07:12,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:12,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=142386.66666666666, ans=0.125 2023-09-28 21:07:15,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 21:07:17,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:07:17,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:17,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:17,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:22,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. 
Duration: 0.88 2023-09-28 21:07:22,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:23,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:23,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:07:24,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=142453.33333333334, ans=10.0 2023-09-28 21:07:26,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:07:29,994 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 21:07:30,030 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 21:07:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:07:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:07:32,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=142453.33333333334, ans=0.0 2023-09-28 21:07:36,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:07:39,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:07:39,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:44,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=142520.0, ans=0.125 2023-09-28 21:07:47,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:47,456 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 21:07:49,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:07:49,402 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:07:52,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:07:52,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:07:53,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:58,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:01,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:03,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 21:08:06,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:06,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:09,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:09,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:08:09,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:09,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 21:08:11,188 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 21:08:11,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:11,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:08:12,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:12,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:12,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-28 21:08:12,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:08:14,818 INFO [train.py:1039] (0/4) Epoch 5, batch 150, loss[loss=0.2765, simple_loss=0.3215, pruned_loss=0.1157, over 23833.00 frames. ], tot_loss[loss=0.2583, simple_loss=0.3159, pruned_loss=0.1003, over 2509973.13 frames. ], batch size: 212, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:08:14,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:08:14,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:15,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:16,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:17,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:08:18,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:08:19,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:23,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:23,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:08:23,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:26,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:28,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:29,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:08:30,636 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.07 vs. limit=15.0 2023-09-28 21:08:31,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:31,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=142720.0, ans=0.0 2023-09-28 21:08:35,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 21:08:35,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 21:08:35,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 21:08:38,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:08:38,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 21:08:40,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:08:41,226 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=142720.0, ans=6.0 2023-09-28 21:08:42,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:42,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:42,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:43,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:43,999 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 21:08:47,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:52,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:56,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:08:57,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 21:09:00,137 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.66 vs. 
limit=22.5 2023-09-28 21:09:00,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:09:01,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:09:02,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:03,151 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.26 vs. limit=10.0 2023-09-28 21:09:05,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:09:07,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:09:08,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:09:08,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:08,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 21:09:11,225 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.71 vs. 
limit=15.0 2023-09-28 21:09:14,557 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.253e+02 2.610e+02 3.187e+02 7.657e+02, threshold=5.219e+02, percent-clipped=8.0 2023-09-28 21:09:14,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:14,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:14,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:09:14,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:09:18,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:20,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 21:09:22,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:09:23,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:09:25,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:27,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:09:27,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. 
Duration: 0.69 2023-09-28 21:09:29,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:29,068 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 21:09:32,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:36,803 INFO [train.py:1039] (0/4) Epoch 5, batch 200, loss[loss=0.265, simple_loss=0.3136, pruned_loss=0.1083, over 23806.00 frames. ], tot_loss[loss=0.2616, simple_loss=0.3174, pruned_loss=0.1028, over 2978979.43 frames. ], batch size: 195, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:09:36,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:09:38,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:09:39,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=142986.66666666666, ans=0.04949747468305833 2023-09-28 21:09:40,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 21:09:41,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:41,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:42,829 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.23 vs. 
limit=15.0 2023-09-28 21:09:43,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 21:09:46,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:09:48,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:48,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:53,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:09:53,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:53,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:55,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=143053.33333333334, ans=0.2 2023-09-28 21:10:04,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=143053.33333333334, ans=0.0 2023-09-28 21:10:10,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=143120.0, ans=0.125 2023-09-28 21:10:12,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:10:13,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:10:15,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:10:16,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:10:17,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:10:17,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:10:20,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:22,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:10:22,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:22,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:24,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 21:10:25,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:10:25,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:31,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 21:10:31,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=143186.66666666666, ans=0.125 2023-09-28 21:10:35,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:43,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:43,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:10:48,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=143253.33333333334, ans=0.125 2023-09-28 21:10:52,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:54,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 21:10:55,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:55,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:10:55,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:57,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 21:10:59,060 INFO [train.py:1039] (0/4) Epoch 5, batch 250, loss[loss=0.2469, simple_loss=0.3181, pruned_loss=0.08789, over 24423.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.3165, pruned_loss=0.1023, over 3369632.09 frames. ], batch size: 69, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:10:59,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 21:11:00,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:02,077 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 21:11:03,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:07,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:11:07,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:08,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:11:10,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:11:10,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:11,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:11:16,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:11:29,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:31,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:11:32,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:11:37,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:11:39,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:11:39,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:11:39,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:41,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:11:41,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:11:41,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:44,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 21:11:47,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 21:11:47,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:49,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:11:49,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:11:49,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:11:50,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=143520.0, ans=0.0 2023-09-28 21:11:51,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:11:53,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:11:53,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:11:56,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:57,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:11:57,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:11:59,968 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.236e+02 2.772e+02 3.274e+02 8.100e+02, threshold=5.544e+02, percent-clipped=4.0 2023-09-28 21:12:01,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:12:04,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:05,792 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.11 vs. limit=12.0 2023-09-28 21:12:07,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=143586.66666666666, ans=0.1 2023-09-28 21:12:08,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:12:11,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=143586.66666666666, ans=0.125 2023-09-28 21:12:16,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:16,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:12:19,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 21:12:21,362 INFO [train.py:1039] (0/4) Epoch 5, batch 300, loss[loss=0.217, simple_loss=0.2863, pruned_loss=0.07387, over 24337.00 frames. ], tot_loss[loss=0.257, simple_loss=0.3138, pruned_loss=0.1001, over 3680259.85 frames. 
], batch size: 61, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:12:21,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:12:21,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:12:24,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 21:12:24,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:12:26,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:12:26,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 21:12:30,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:33,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:12:36,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:12:38,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 21:12:39,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:41,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-28 21:12:41,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 21:12:41,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:45,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:12:52,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:12:52,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 21:12:54,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=143786.66666666666, ans=0.125 2023-09-28 21:12:55,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 21:12:55,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:58,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:59,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:59,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 21:12:59,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 21:13:02,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:13:03,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:13:05,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:10,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:13:10,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 21:13:11,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:13:14,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:14,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 21:13:16,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:20,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:13:23,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:13:23,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-09-28 21:13:28,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:28,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:13:31,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:31,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:13:33,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 21:13:33,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:13:33,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:34,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 21:13:37,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:38,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:38,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:38,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:13:40,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:44,689 INFO [train.py:1039] (0/4) Epoch 5, batch 350, loss[loss=0.2672, simple_loss=0.317, pruned_loss=0.1087, over 23334.00 frames. ], tot_loss[loss=0.255, simple_loss=0.3116, pruned_loss=0.09915, over 3914158.91 frames. ], batch size: 105, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:13:44,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:13:44,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 21:13:48,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=143986.66666666666, ans=0.09899494936611666 2023-09-28 21:13:50,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:56,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:59,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:59,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:04,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 21:14:05,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 21:14:05,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 21:14:06,670 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.44 vs. limit=22.5 2023-09-28 21:14:08,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:08,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 21:14:10,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:14,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 21:14:15,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:14:17,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:18,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:14:20,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:20,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:21,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 21:14:21,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:21,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:14:23,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:14:23,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:30,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:14:30,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:14:32,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:14:32,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:38,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 21:14:38,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:14:41,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=144186.66666666666, ans=0.125 2023-09-28 21:14:44,463 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 2.143e+02 2.367e+02 2.704e+02 4.411e+02, threshold=4.734e+02, percent-clipped=0.0 2023-09-28 21:14:44,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:44,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:14:44,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:48,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 21:14:48,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:14:50,746 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 21:14:52,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 21:14:52,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:55,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:55,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. 
Duration: 0.45 2023-09-28 21:14:57,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:02,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:15:04,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:04,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:04,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:05,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:07,263 INFO [train.py:1039] (0/4) Epoch 5, batch 400, loss[loss=0.2619, simple_loss=0.3247, pruned_loss=0.09953, over 24034.00 frames. ], tot_loss[loss=0.2547, simple_loss=0.3112, pruned_loss=0.09908, over 4099080.93 frames. ], batch size: 86, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:15:08,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:15:11,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:15:13,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 21:15:13,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:15:14,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:14,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:15:16,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:18,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:20,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:23,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 21:15:27,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 21:15:27,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:27,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 21:15:28,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:33,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:15:33,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:15:33,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 21:15:33,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:15:33,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:33,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:35,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:37,236 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 21:15:37,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=144386.66666666666, ans=0.2 2023-09-28 21:15:38,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 21:15:43,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:43,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:45,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 21:15:46,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-09-28 21:15:49,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:15:52,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:00,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 21:16:04,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:16:06,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 21:16:08,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:16:09,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:16:09,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 21:16:13,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:16:16,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:16:18,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:16:20,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 21:16:21,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 21:16:24,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:16:24,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 21:16:25,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:16:25,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:16:25,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=144586.66666666666, ans=0.125 2023-09-28 21:16:26,278 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.43 vs. limit=12.0 2023-09-28 21:16:27,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=144653.33333333334, ans=0.125 2023-09-28 21:16:28,813 INFO [train.py:1039] (0/4) Epoch 5, batch 450, loss[loss=0.3118, simple_loss=0.3331, pruned_loss=0.1453, over 19466.00 frames. ], tot_loss[loss=0.2546, simple_loss=0.3121, pruned_loss=0.09854, over 4247743.51 frames. ], batch size: 388, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:16:28,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 21:16:32,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 21:16:32,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:16:34,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:16:36,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 21:16:36,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:16:37,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:16:39,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:16:39,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 21:16:39,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:16:40,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:16:44,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:16:49,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=144720.0, ans=0.0 2023-09-28 21:16:53,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 21:16:54,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:16:55,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 21:16:56,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 21:16:59,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:17:01,220 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.63 vs. limit=22.5 2023-09-28 21:17:02,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:05,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:09,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:11,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:12,958 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=144786.66666666666, ans=0.125 2023-09-28 21:17:14,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 21:17:15,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. 
Duration: 0.97 2023-09-28 21:17:17,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 21:17:17,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:17:19,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:19,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:17:21,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 21:17:22,911 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 21:17:22,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:25,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:17:26,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:17:29,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=144853.33333333334, ans=0.1 2023-09-28 21:17:30,463 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.848e+02 2.241e+02 2.627e+02 3.194e+02 6.560e+02, threshold=5.254e+02, percent-clipped=4.0 2023-09-28 21:17:30,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. 
Duration: 0.57275 2023-09-28 21:17:30,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:17:30,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=144853.33333333334, ans=0.2 2023-09-28 21:17:32,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:17:32,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 21:17:35,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:36,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:17:36,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:17:38,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 21:17:41,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=144920.0, ans=0.0 2023-09-28 21:17:43,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:17:43,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 21:17:45,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. 
Duration: 0.61 2023-09-28 21:17:47,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:47,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=144920.0, ans=0.1 2023-09-28 21:17:51,509 INFO [train.py:1039] (0/4) Epoch 5, batch 500, loss[loss=0.2136, simple_loss=0.2836, pruned_loss=0.07184, over 24450.00 frames. ], tot_loss[loss=0.2549, simple_loss=0.3123, pruned_loss=0.09875, over 4349042.63 frames. ], batch size: 63, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:17:53,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:17:55,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:17:56,553 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.68 vs. limit=8.0 2023-09-28 21:17:56,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:17:57,021 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 21:17:58,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=144986.66666666666, ans=0.1 2023-09-28 21:18:01,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:01,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 21:18:01,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:01,909 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 21:18:03,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 21:18:03,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:06,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:18:11,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 21:18:11,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:18:12,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:18:14,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:14,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:19,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=145053.33333333334, ans=0.125 2023-09-28 21:18:25,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 21:18:25,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:18:26,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:18:26,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:27,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 21:18:27,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:18:32,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:18:33,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:18:33,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:18:33,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:33,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 21:18:38,264 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 21:18:39,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:18:41,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:18:46,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 21:18:47,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:18:51,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:18:55,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:58,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:19:00,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=145253.33333333334, ans=0.125 2023-09-28 21:19:05,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:19:06,438 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.585e-02 2023-09-28 21:19:07,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 21:19:09,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:09,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:09,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=145253.33333333334, ans=0.125 2023-09-28 21:19:11,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=145253.33333333334, ans=0.1 2023-09-28 21:19:12,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 21:19:12,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:19:15,000 INFO [train.py:1039] (0/4) Epoch 5, batch 550, loss[loss=0.2726, simple_loss=0.3207, pruned_loss=0.1123, over 23425.00 frames. ], tot_loss[loss=0.2579, simple_loss=0.314, pruned_loss=0.1009, over 4411744.02 frames. ], batch size: 285, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:19:15,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:20,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. 
Duration: 0.95 2023-09-28 21:19:21,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 21:19:23,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:23,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 21:19:23,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:19:23,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:24,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:19:26,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:19:28,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:30,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 21:19:30,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 21:19:31,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=145386.66666666666, ans=0.2 2023-09-28 21:19:35,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:19:35,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:38,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:19:40,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:44,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 21:19:46,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 21:19:47,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:19:52,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:19:52,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:19:54,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 21:19:59,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=145453.33333333334, ans=0.125 2023-09-28 21:20:00,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:01,006 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 21:20:01,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:20:03,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:20:06,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:20:06,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:20:06,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:20:08,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:09,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 21:20:11,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 21:20:11,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:20:11,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:20:12,162 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.32 vs. limit=15.0 2023-09-28 21:20:13,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:20:13,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:20:16,497 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.228e+02 2.515e+02 3.038e+02 5.618e+02, threshold=5.030e+02, percent-clipped=1.0 2023-09-28 21:20:16,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:20:16,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:20:19,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:20:21,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:21,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 21:20:22,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 21:20:24,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:24,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:20:26,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:26,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:20:27,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:20:30,287 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.64 vs. limit=10.0 2023-09-28 21:20:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 21:20:38,076 INFO [train.py:1039] (0/4) Epoch 5, batch 600, loss[loss=0.2588, simple_loss=0.3252, pruned_loss=0.09617, over 24357.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.3146, pruned_loss=0.1014, over 4466934.04 frames. ], batch size: 77, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:20:38,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 21:20:39,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:20:39,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 21:20:41,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:48,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:20:50,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:20:52,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 21:20:55,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:20:55,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:20:58,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:01,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 21:21:02,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:21:08,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 21:21:12,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:21:12,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:21:13,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:21:13,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=145786.66666666666, ans=0.0 2023-09-28 21:21:19,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:21:19,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:21:19,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:26,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:21:29,758 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:21:31,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:31,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:21:31,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:31,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=145853.33333333334, ans=0.125 2023-09-28 21:21:39,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-09-28 21:21:39,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=145853.33333333334, ans=0.1 2023-09-28 21:21:44,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:21:44,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:21:49,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 21:21:49,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:21:51,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 21:21:51,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:21:51,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:21:56,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=145920.0, ans=0.125 2023-09-28 21:21:56,687 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:21:58,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:21:59,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. 
Duration: 0.663625 2023-09-28 21:22:00,848 INFO [train.py:1039] (0/4) Epoch 5, batch 650, loss[loss=0.263, simple_loss=0.3125, pruned_loss=0.1067, over 23773.00 frames. ], tot_loss[loss=0.2582, simple_loss=0.3134, pruned_loss=0.1015, over 4521683.19 frames. ], batch size: 179, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:22:01,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:02,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:22:05,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:07,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 21:22:08,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:22:13,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:22:13,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:17,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:20,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 21:22:24,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:22:25,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:30,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:22:30,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:22:32,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=146053.33333333334, ans=0.125 2023-09-28 21:22:33,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:33,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:34,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:22:36,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:36,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=146120.0, ans=0.125 2023-09-28 21:22:38,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:22:40,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:22:40,986 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-09-28 21:22:41,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:41,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:44,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:44,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:46,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:22:46,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:22:47,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 21:22:50,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:22:51,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:53,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:22:53,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:53,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-28 21:22:55,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 21:22:56,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 21:22:57,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=146186.66666666666, ans=0.125 2023-09-28 21:22:58,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:58,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:58,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:22:58,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:23:01,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:23:04,874 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.282e+02 2.474e+02 2.887e+02 4.172e+02, threshold=4.947e+02, percent-clipped=0.0 2023-09-28 21:23:06,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:06,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:08,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 21:23:09,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=146253.33333333334, ans=0.125 2023-09-28 21:23:11,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:11,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:23:12,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:19,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:23:19,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:19,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:19,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:23,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=146320.0, ans=0.125 2023-09-28 21:23:24,860 INFO [train.py:1039] (0/4) Epoch 5, batch 700, loss[loss=0.237, simple_loss=0.29, pruned_loss=0.09206, over 23398.00 frames. ], tot_loss[loss=0.2563, simple_loss=0.3122, pruned_loss=0.1002, over 4567553.17 frames. ], batch size: 134, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:23:27,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. 
Duration: 0.93 2023-09-28 21:23:27,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 21:23:30,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 21:23:30,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=146320.0, ans=0.0 2023-09-28 21:23:30,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=146320.0, ans=0.1 2023-09-28 21:23:31,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:33,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:23:33,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 21:23:39,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:42,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:23:44,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:45,117 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.04 vs. limit=15.0 2023-09-28 21:23:46,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. 
Duration: 0.5 2023-09-28 21:23:46,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:49,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:50,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=146386.66666666666, ans=0.1 2023-09-28 21:23:52,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:23:52,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:23:55,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 21:24:00,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 21:24:05,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:24:05,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:24:07,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:24:10,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:24:12,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-09-28 21:24:15,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:15,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:24:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 21:24:21,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:24:23,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:25,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:24:28,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=146586.66666666666, ans=0.2 2023-09-28 21:24:30,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:24:31,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 21:24:37,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 21:24:38,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 21:24:38,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:24:41,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:24:42,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:24:44,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:44,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 21:24:47,184 INFO [train.py:1039] (0/4) Epoch 5, batch 750, loss[loss=0.2427, simple_loss=0.2967, pruned_loss=0.09432, over 23775.00 frames. ], tot_loss[loss=0.2548, simple_loss=0.3114, pruned_loss=0.09912, over 4592521.99 frames. ], batch size: 212, lr: 2.07e-02, grad_scale: 16.0 2023-09-28 21:24:48,282 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=6.91 vs. limit=15.0 2023-09-28 21:24:49,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 21:24:49,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 21:24:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 21:24:49,538 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=146653.33333333334, ans=0.125 2023-09-28 21:24:50,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. 
Duration: 0.91 2023-09-28 21:24:50,889 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:24:52,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 21:24:52,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:24:53,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 21:24:55,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:55,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:24:58,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:00,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:00,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=146653.33333333334, ans=0.04949747468305833 2023-09-28 21:25:02,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:25:02,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:04,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 21:25:05,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:25:08,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:25:12,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:14,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:14,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 21:25:14,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:25:17,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:25:19,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=146786.66666666666, ans=0.125 2023-09-28 21:25:21,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 21:25:21,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:25:23,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 21:25:24,798 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 21:25:24,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 21:25:24,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:25:24,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:25:25,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=146786.66666666666, ans=0.0 2023-09-28 21:25:28,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:25:34,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:25:34,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:25:37,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:39,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:25:39,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 21:25:40,430 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.24 vs. limit=15.0 2023-09-28 21:25:40,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:25:42,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 21:25:43,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:25:47,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:25:47,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 21:25:48,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:51,067 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.300e+02 2.781e+02 3.196e+02 5.681e+02, threshold=5.563e+02, percent-clipped=1.0 2023-09-28 21:25:52,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:25:55,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 21:25:55,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:58,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:26:01,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 21:26:01,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:02,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:10,171 INFO [train.py:1039] (0/4) Epoch 5, batch 800, loss[loss=0.2488, simple_loss=0.3203, pruned_loss=0.08866, over 24555.00 frames. ], tot_loss[loss=0.2538, simple_loss=0.3107, pruned_loss=0.09845, over 4628825.72 frames. ], batch size: 71, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:26:10,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:12,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:26:18,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 21:26:18,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:21,016 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=146986.66666666666, ans=0.125 2023-09-28 21:26:22,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:22,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:23,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:23,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:24,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:29,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:31,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 21:26:31,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=147053.33333333334, ans=0.0 2023-09-28 21:26:31,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=147053.33333333334, ans=0.0 2023-09-28 21:26:34,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 21:26:35,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:37,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:37,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:26:37,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:38,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 21:26:38,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:38,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 21:26:42,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:43,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:26:45,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:47,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:50,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:50,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:56,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:26:56,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:26:56,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 21:26:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 21:26:58,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 21:26:58,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:27:00,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:01,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:27:01,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:07,086 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 21:27:07,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 21:27:08,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:27:10,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:27:13,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:27:18,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:18,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 21:27:19,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:27:22,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 21:27:22,736 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.91 vs. limit=15.0 2023-09-28 21:27:30,750 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.45 vs. 
limit=15.0 2023-09-28 21:27:31,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:32,893 INFO [train.py:1039] (0/4) Epoch 5, batch 850, loss[loss=0.2154, simple_loss=0.277, pruned_loss=0.07685, over 24346.00 frames. ], tot_loss[loss=0.2533, simple_loss=0.3107, pruned_loss=0.09791, over 4648073.25 frames. ], batch size: 56, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:27:33,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:27:34,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 21:27:34,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:27:36,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:39,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 21:27:39,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:40,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:27:42,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:43,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 21:27:45,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:46,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 21:27:46,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 21:27:46,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 21:27:48,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:48,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:27:50,115 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=147386.66666666666, ans=0.125 2023-09-28 21:27:51,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:51,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:52,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:27:58,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:58,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:27:58,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 21:28:03,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 21:28:06,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:28:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 21:28:11,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 21:28:12,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 21:28:14,759 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 21:28:14,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:28:14,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:28:18,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:28:19,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 21:28:20,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=147453.33333333334, ans=0.1 2023-09-28 21:28:22,207 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.99 vs. limit=22.5 2023-09-28 21:28:23,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:24,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:24,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:28:24,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:28:26,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:28:28,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:28:28,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 21:28:34,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 21:28:34,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:35,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:28:36,311 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 2.245e+02 2.598e+02 3.142e+02 5.686e+02, threshold=5.195e+02, percent-clipped=1.0 2023-09-28 21:28:36,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:36,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:38,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:40,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:28:40,866 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.60 vs. limit=15.0 2023-09-28 21:28:41,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:28:42,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:28:43,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-09-28 21:28:51,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:28:53,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:53,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 21:28:54,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:28:54,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:56,152 INFO [train.py:1039] (0/4) Epoch 5, batch 900, loss[loss=0.2839, simple_loss=0.3264, pruned_loss=0.1207, over 22741.00 frames. ], tot_loss[loss=0.2541, simple_loss=0.3119, pruned_loss=0.09812, over 4677207.17 frames. ], batch size: 322, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:28:57,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 21:29:05,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:29:06,661 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.75 vs. limit=15.0 2023-09-28 21:29:07,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:07,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-09-28 21:29:10,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:29:12,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 21:29:12,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:29:14,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:29:14,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:14,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:29:15,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:29:22,940 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.13 vs. limit=22.5 2023-09-28 21:29:26,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:29:26,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:28,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 21:29:31,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:32,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=147786.66666666666, ans=0.125 2023-09-28 21:29:37,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 21:29:38,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:29:42,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:29:42,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:29:42,256 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 21:29:43,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 21:29:48,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=147853.33333333334, ans=0.09899494936611666 2023-09-28 21:29:52,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:29:52,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:29:52,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-28 21:29:52,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=147853.33333333334, ans=0.1 2023-09-28 21:29:59,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:00,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:01,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 21:30:01,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:30:05,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 21:30:08,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:30:08,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:10,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:30:10,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 21:30:15,068 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. 
Duration: 0.91 2023-09-28 21:30:16,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:30:16,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 21:30:18,138 INFO [train.py:1039] (0/4) Epoch 5, batch 950, loss[loss=0.2488, simple_loss=0.2903, pruned_loss=0.1036, over 22592.00 frames. ], tot_loss[loss=0.2539, simple_loss=0.3117, pruned_loss=0.09807, over 4697506.07 frames. ], batch size: 322, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:30:20,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:25,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 21:30:31,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:33,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:33,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:35,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:30:35,344 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 21:30:38,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:30:40,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:30:40,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:40,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:30:42,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 21:30:44,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:30:45,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:47,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 21:30:47,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:50,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:50,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:51,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:53,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-09-28 21:30:56,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:30:57,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:31:00,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:31:04,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:04,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:31:07,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 21:31:08,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:31:08,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:31:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:11,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:11,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:31:17,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-09-28 21:31:18,587 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.59 vs. limit=6.0 2023-09-28 21:31:19,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:31:21,894 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.107e+02 2.418e+02 2.816e+02 4.980e+02, threshold=4.836e+02, percent-clipped=0.0 2023-09-28 21:31:22,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:22,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:22,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 21:31:22,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:22,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:31:23,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 21:31:28,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:31:29,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:31:35,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:31:36,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 21:31:36,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 21:31:40,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=148320.0, ans=0.125 2023-09-28 21:31:41,336 INFO [train.py:1039] (0/4) Epoch 5, batch 1000, loss[loss=0.2796, simple_loss=0.3241, pruned_loss=0.1176, over 23155.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3105, pruned_loss=0.0975, over 4703853.44 frames. ], batch size: 105, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:31:41,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:43,478 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:31:45,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 21:31:45,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:50,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:31:52,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. 
Duration: 0.86 2023-09-28 21:31:52,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 21:32:00,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:00,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:32:02,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:04,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 21:32:08,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 21:32:11,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 21:32:11,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:12,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 21:32:14,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 21:32:14,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 21:32:14,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:32:16,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:22,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=148453.33333333334, ans=0.1 2023-09-28 21:32:23,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:25,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:32:26,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:26,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:26,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 21:32:28,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:28,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:32:29,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:29,782 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 21:32:34,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. 
Duration: 0.89 2023-09-28 21:32:34,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 21:32:34,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=148520.0, ans=0.125 2023-09-28 21:32:37,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 21:32:39,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:32:39,949 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=148520.0, ans=0.125 2023-09-28 21:32:40,100 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=148520.0, ans=0.0 2023-09-28 21:32:47,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:47,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:32:47,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:49,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:32:49,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=148586.66666666666, ans=0.95 2023-09-28 21:32:50,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-09-28 21:32:53,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:32:53,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 21:32:54,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 21:32:56,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:32:56,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:59,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:33:02,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:33:02,453 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=2.945e-03 2023-09-28 21:33:03,570 INFO [train.py:1039] (0/4) Epoch 5, batch 1050, loss[loss=0.2491, simple_loss=0.323, pruned_loss=0.08764, over 24605.00 frames. ], tot_loss[loss=0.2514, simple_loss=0.3091, pruned_loss=0.09687, over 4713562.69 frames. ], batch size: 71, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:33:04,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:07,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 21:33:09,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:33:09,621 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.39 vs. limit=22.5 2023-09-28 21:33:10,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:33:12,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:14,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:16,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:33:18,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:33:21,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:33:21,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:33:21,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:33:24,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 21:33:25,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 21:33:26,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:28,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 21:33:29,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:33:29,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 21:33:29,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:33:34,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:36,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:33:36,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 21:33:39,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 21:33:39,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-28 21:33:45,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 21:33:48,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 21:33:49,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:53,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:33:55,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:33:55,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:33:56,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:34:01,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:34:04,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 21:34:06,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 21:34:06,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-28 21:34:07,749 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.205e+02 2.391e+02 2.864e+02 4.460e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-28 21:34:07,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:07,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:34:10,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 21:34:14,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:34:17,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:17,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:17,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:17,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 21:34:23,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 21:34:23,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 21:34:23,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 21:34:24,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:34:27,606 INFO [train.py:1039] (0/4) Epoch 5, batch 1100, loss[loss=0.2661, simple_loss=0.3152, pruned_loss=0.1085, over 23484.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.3084, pruned_loss=0.09605, over 4721822.83 frames. ], batch size: 134, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:34:28,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=148986.66666666666, ans=0.125 2023-09-28 21:34:29,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:34:34,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:34:38,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=148986.66666666666, ans=0.0 2023-09-28 21:34:40,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:34:41,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:34:41,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:34:43,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 21:34:44,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:34:45,427 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.08 vs. limit=15.0 2023-09-28 21:34:46,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=149053.33333333334, ans=0.2 2023-09-28 21:34:47,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:34:49,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:34:51,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:34:52,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 21:34:54,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:34:56,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:56,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:58,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. 
Duration: 0.97775 2023-09-28 21:34:59,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:35:05,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:35:06,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=149120.0, ans=0.125 2023-09-28 21:35:08,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 21:35:11,132 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 21:35:11,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:12,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=149120.0, ans=0.5 2023-09-28 21:35:15,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:15,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:35:17,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:35:17,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 21:35:18,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 21:35:18,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:35:18,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:35:18,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=149186.66666666666, ans=0.125 2023-09-28 21:35:20,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:20,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 21:35:24,657 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.68 vs. limit=15.0 2023-09-28 21:35:26,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:35:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 21:35:28,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:35:32,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=149253.33333333334, ans=0.5 2023-09-28 21:35:33,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-28 21:35:35,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 21:35:35,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:35:37,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:40,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:35:41,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:41,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 21:35:43,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:35:45,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:45,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 21:35:45,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:35:46,368 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.57 vs. limit=6.0 2023-09-28 21:35:47,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. 
Duration: 0.86 2023-09-28 21:35:48,545 INFO [train.py:1039] (0/4) Epoch 5, batch 1150, loss[loss=0.2575, simple_loss=0.3256, pruned_loss=0.09474, over 24386.00 frames. ], tot_loss[loss=0.2509, simple_loss=0.3091, pruned_loss=0.09637, over 4727971.81 frames. ], batch size: 77, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:35:48,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:35:48,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:35:50,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:35:53,538 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=149320.0, ans=0.1 2023-09-28 21:35:56,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:35:57,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:35:58,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=149320.0, ans=0.0 2023-09-28 21:36:00,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:01,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:36:01,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. 
Duration: 0.95 2023-09-28 21:36:01,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:04,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=149386.66666666666, ans=0.95 2023-09-28 21:36:05,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 21:36:05,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:05,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:36:12,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 21:36:14,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:17,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:19,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:19,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 21:36:19,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:36:21,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:36:24,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 21:36:26,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:29,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:36,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=149520.0, ans=0.125 2023-09-28 21:36:38,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=149520.0, ans=0.0 2023-09-28 21:36:41,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 21:36:47,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:48,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:50,958 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.166e+02 2.435e+02 2.809e+02 4.003e+02, threshold=4.871e+02, percent-clipped=0.0 2023-09-28 21:36:52,857 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. 
Duration: 0.768 2023-09-28 21:36:53,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=149586.66666666666, ans=0.125 2023-09-28 21:36:54,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:02,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=149586.66666666666, ans=0.1 2023-09-28 21:37:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 21:37:05,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=149586.66666666666, ans=0.0 2023-09-28 21:37:07,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:07,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:37:09,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:37:09,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:37:10,903 INFO [train.py:1039] (0/4) Epoch 5, batch 1200, loss[loss=0.2618, simple_loss=0.3167, pruned_loss=0.1034, over 23285.00 frames. ], tot_loss[loss=0.2519, simple_loss=0.3106, pruned_loss=0.09657, over 4736196.14 frames. ], batch size: 93, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:37:13,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:37:13,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=149653.33333333334, ans=0.125 2023-09-28 21:37:20,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:37:20,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:37:22,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:22,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:22,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:37:25,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:37:28,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:37:29,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=149720.0, ans=0.05 2023-09-28 21:37:30,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:30,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:31,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. 
Duration: 0.896 2023-09-28 21:37:35,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 21:37:38,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:37:41,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:37:43,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:47,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:37:47,353 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 21:37:48,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:54,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=149786.66666666666, ans=0.125 2023-09-28 21:37:55,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:37:55,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:37:55,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 21:37:57,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-28 21:38:00,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 21:38:04,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 21:38:04,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:38:06,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:38:09,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:09,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:38:11,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:38:11,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:38:12,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:38:13,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 21:38:14,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:38:14,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 21:38:14,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:38:18,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:18,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:19,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=149920.0, ans=0.125 2023-09-28 21:38:23,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:38:25,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:38:28,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 21:38:31,823 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 21:38:33,134 INFO [train.py:1039] (0/4) Epoch 5, batch 1250, loss[loss=0.2823, simple_loss=0.329, pruned_loss=0.1179, over 23395.00 frames. ], tot_loss[loss=0.2527, simple_loss=0.3115, pruned_loss=0.09696, over 4741591.62 frames. ], batch size: 119, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:38:34,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:38:36,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 21:38:36,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=149986.66666666666, ans=0.2 2023-09-28 21:38:37,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:38:39,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:41,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 21:38:44,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:38:46,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:47,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 21:38:49,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:38:49,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=150053.33333333334, ans=0.0 2023-09-28 21:38:51,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:38:56,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:38:56,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 21:38:57,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:38:57,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:00,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:39:03,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 21:39:03,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:03,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:04,168 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.32 vs. limit=10.0 2023-09-28 21:39:04,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:06,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:09,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:12,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. 
Duration: 0.7 2023-09-28 21:39:16,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 21:39:17,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:39:18,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=150120.0, ans=0.09899494936611666 2023-09-28 21:39:20,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:20,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 21:39:20,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:39:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 21:39:21,902 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.53 vs. limit=15.0 2023-09-28 21:39:22,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:22,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:26,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:39:29,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:39:31,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 21:39:31,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 21:39:33,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 21:39:36,087 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.883e+02 2.272e+02 2.528e+02 2.863e+02 4.623e+02, threshold=5.057e+02, percent-clipped=0.0 2023-09-28 21:39:36,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:39:37,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 21:39:37,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:39,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:39:40,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:39:41,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 21:39:41,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. 
Duration: 0.72225 2023-09-28 21:39:42,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:39:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:39:42,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:46,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 21:39:47,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:50,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:39:52,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:39:53,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:55,810 INFO [train.py:1039] (0/4) Epoch 5, batch 1300, loss[loss=0.2404, simple_loss=0.3126, pruned_loss=0.08412, over 24490.00 frames. ], tot_loss[loss=0.2534, simple_loss=0.3119, pruned_loss=0.09742, over 4731116.32 frames. ], batch size: 66, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:39:57,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:57,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-09-28 21:39:57,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=150320.0, ans=0.0 2023-09-28 21:40:03,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:06,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:40:06,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:08,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:40:10,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:40:10,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 21:40:16,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:40:17,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:40:20,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 21:40:22,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:40:26,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:40:27,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:27,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=150453.33333333334, ans=0.0 2023-09-28 21:40:30,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:30,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:32,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:40:32,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:40:34,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 21:40:34,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=150453.33333333334, ans=0.95 2023-09-28 21:40:39,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:40:39,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:40:42,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. 
Duration: 0.62 2023-09-28 21:40:42,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:40:45,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:40:48,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:48,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 21:40:48,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:48,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 21:40:51,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:53,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:53,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:40:58,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 21:40:59,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 21:41:01,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. 
Duration: 0.66 2023-09-28 21:41:03,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=150586.66666666666, ans=0.125 2023-09-28 21:41:06,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:41:09,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 21:41:11,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:18,066 INFO [train.py:1039] (0/4) Epoch 5, batch 1350, loss[loss=0.249, simple_loss=0.2779, pruned_loss=0.1101, over 19521.00 frames. ], tot_loss[loss=0.2531, simple_loss=0.3112, pruned_loss=0.09751, over 4711544.44 frames. ], batch size: 389, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:41:18,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 21:41:21,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:21,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=150653.33333333334, ans=0.125 2023-09-28 21:41:24,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:27,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:27,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:41:31,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:41:31,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:35,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:38,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 21:41:38,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=150720.0, ans=0.125 2023-09-28 21:41:41,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:41:41,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:41:42,650 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.29 vs. limit=15.0 2023-09-28 21:41:43,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 21:41:43,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:41:46,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:41:46,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. 
Duration: 0.98 2023-09-28 21:41:47,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 21:41:50,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=150786.66666666666, ans=0.125 2023-09-28 21:41:51,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 21:41:52,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:52,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 21:42:00,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=150786.66666666666, ans=0.125 2023-09-28 21:42:03,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:15,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:15,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:16,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 21:42:19,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:20,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. 
Duration: 0.83 2023-09-28 21:42:20,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:42:22,127 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.340e+02 2.561e+02 2.889e+02 4.488e+02, threshold=5.123e+02, percent-clipped=0.0 2023-09-28 21:42:22,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:42:25,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:42:27,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 21:42:30,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:42:35,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 21:42:36,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 21:42:40,607 INFO [train.py:1039] (0/4) Epoch 5, batch 1400, loss[loss=0.2356, simple_loss=0.2873, pruned_loss=0.09197, over 23479.00 frames. ], tot_loss[loss=0.2525, simple_loss=0.3101, pruned_loss=0.09746, over 4725871.18 frames. ], batch size: 285, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:42:43,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 21:42:45,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 21:42:46,086 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=150986.66666666666, ans=0.125 2023-09-28 21:42:47,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:42:49,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:42:55,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 21:42:55,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=151053.33333333334, ans=0.125 2023-09-28 21:42:56,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 21:43:08,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:43:09,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:11,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:43:11,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-09-28 21:43:11,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=151120.0, ans=0.125 2023-09-28 21:43:16,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:43:16,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:43:25,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=151120.0, ans=0.125 2023-09-28 21:43:27,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:27,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:32,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 21:43:32,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:43:32,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:43:33,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:43:35,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:35,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 21:43:37,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:43:37,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:43:38,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 21:43:38,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:43:43,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:45,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=151253.33333333334, ans=0.0 2023-09-28 21:43:48,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:43:56,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 21:43:58,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:43:58,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:44:01,389 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.99 vs. limit=22.5 2023-09-28 21:44:01,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-28 21:44:02,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:03,467 INFO [train.py:1039] (0/4) Epoch 5, batch 1450, loss[loss=0.228, simple_loss=0.2835, pruned_loss=0.08623, over 23377.00 frames. ], tot_loss[loss=0.2511, simple_loss=0.3088, pruned_loss=0.09666, over 4727146.42 frames. ], batch size: 119, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:44:05,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:44:08,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:44:10,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:44:10,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:10,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:44:12,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.21 vs. limit=15.0 2023-09-28 21:44:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:16,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:44:16,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 21:44:17,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 21:44:19,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:44:19,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 21:44:19,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:20,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:20,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 21:44:24,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:24,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:44:25,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 21:44:26,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:26,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:44:29,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:44:31,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:34,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:44:34,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:44:36,472 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=151453.33333333334, ans=0.125 2023-09-28 21:44:37,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:37,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:40,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:41,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:44:41,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:41,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:44:44,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. 
Duration: 0.73 2023-09-28 21:44:47,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:49,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=151453.33333333334, ans=0.125 2023-09-28 21:44:52,392 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 21:44:53,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:44:55,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:44:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:44:59,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 21:45:04,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:06,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 21:45:07,701 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.862e+02 2.279e+02 2.648e+02 3.024e+02 3.849e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 21:45:07,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 21:45:09,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 21:45:11,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:12,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:45:13,430 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.17 vs. limit=15.0 2023-09-28 21:45:14,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 21:45:17,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 21:45:18,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 21:45:19,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:20,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:45:25,526 INFO [train.py:1039] (0/4) Epoch 5, batch 1500, loss[loss=0.2648, simple_loss=0.3192, pruned_loss=0.1052, over 23628.00 frames. ], tot_loss[loss=0.2516, simple_loss=0.3094, pruned_loss=0.09696, over 4726782.70 frames. ], batch size: 256, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:45:31,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 21:45:31,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 21:45:31,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:45:31,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:33,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:34,442 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.57 vs. limit=10.0 2023-09-28 21:45:35,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:45:37,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 21:45:39,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:45:40,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:45:40,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:42,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:43,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:45:44,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:45,890 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:45:50,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 21:45:51,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:45:51,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:45:53,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:56,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 21:46:00,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 21:46:01,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:01,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 21:46:04,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 21:46:06,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:08,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:46:08,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:09,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 21:46:09,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:46:09,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:11,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 21:46:11,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:18,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:46:18,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 21:46:22,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:46:24,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 21:46:29,575 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 21:46:30,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:30,884 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 21:46:32,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:46:33,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:46:34,059 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 21:46:35,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:46:39,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 21:46:40,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:43,033 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.17 vs. limit=22.5 2023-09-28 21:46:44,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:44,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:46:44,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:45,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:45,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:46,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=151986.66666666666, ans=0.1 2023-09-28 21:46:47,873 INFO [train.py:1039] (0/4) Epoch 5, batch 1550, loss[loss=0.2554, simple_loss=0.3052, pruned_loss=0.1028, over 23794.00 frames. ], tot_loss[loss=0.2519, simple_loss=0.3097, pruned_loss=0.09706, over 4728716.26 frames. ], batch size: 212, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:46:48,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 21:46:48,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=151986.66666666666, ans=0.125 2023-09-28 21:46:49,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 21:46:49,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:46:51,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 21:46:52,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-09-28 21:46:54,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:55,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:56,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:56,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:46:57,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:57,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:47:02,068 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 21:47:02,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:02,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:47:02,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:47:05,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:47:05,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. 
Duration: 0.71 2023-09-28 21:47:07,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:47:08,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 21:47:08,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 21:47:08,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 21:47:08,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:11,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=152053.33333333334, ans=0.1 2023-09-28 21:47:12,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:17,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:47:19,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=152120.0, ans=0.0 2023-09-28 21:47:20,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 21:47:20,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-09-28 21:47:22,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=152120.0, ans=0.1 2023-09-28 21:47:27,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:27,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=152120.0, ans=0.125 2023-09-28 21:47:30,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:47:32,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:47:32,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:47:32,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 21:47:32,724 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.64 vs. limit=15.0 2023-09-28 21:47:38,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:47:40,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:42,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:47:45,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:47:47,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:47,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 21:47:47,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:47:50,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:47:50,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:51,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 21:47:52,244 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.346e+02 2.949e+02 3.489e+02 5.626e+02, threshold=5.898e+02, percent-clipped=1.0 2023-09-28 21:47:52,342 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 21:47:54,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:47:59,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 21:48:05,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:48:06,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:08,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 21:48:09,966 INFO [train.py:1039] (0/4) Epoch 5, batch 1600, loss[loss=0.2246, simple_loss=0.2907, pruned_loss=0.07924, over 24521.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3099, pruned_loss=0.09674, over 4723194.30 frames. ], batch size: 63, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:48:10,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:48:12,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:12,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:48:12,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:48:13,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:48:17,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=152320.0, ans=0.125 2023-09-28 21:48:18,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:19,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. 
Duration: 0.67 2023-09-28 21:48:19,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 21:48:21,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 21:48:25,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:48:26,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 21:48:28,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:48:30,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:48:35,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:48:38,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 21:48:43,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:48:44,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 21:48:44,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:44,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. 
Duration: 0.96 2023-09-28 21:48:49,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 21:48:58,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:59,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 21:49:00,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:49:00,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:00,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:49:01,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 21:49:07,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 21:49:08,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:49:08,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:09,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:10,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 21:49:11,269 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.86 vs. limit=10.0 2023-09-28 21:49:13,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:49:13,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:49:16,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:49:22,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:23,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:49:27,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 21:49:27,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:49:29,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 21:49:32,702 INFO [train.py:1039] (0/4) Epoch 5, batch 1650, loss[loss=0.2595, simple_loss=0.3266, pruned_loss=0.09624, over 24664.00 frames. ], tot_loss[loss=0.2531, simple_loss=0.3106, pruned_loss=0.09785, over 4723181.43 frames. ], batch size: 73, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:49:34,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:49:35,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:49:37,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:49:37,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 21:49:37,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 21:49:37,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 21:49:37,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 21:49:37,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=152653.33333333334, ans=0.125 2023-09-28 21:49:43,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:44,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:44,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:49:45,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:49:46,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 21:49:50,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 21:49:51,644 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.89 vs. limit=22.5 2023-09-28 21:49:52,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:52,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:52,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:49:52,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:49:53,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 21:49:53,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 21:49:58,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:50:02,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:50:02,813 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.45 vs. 
limit=15.0 2023-09-28 21:50:10,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 21:50:11,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=152786.66666666666, ans=15.0 2023-09-28 21:50:12,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:16,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 21:50:19,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:22,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:50:22,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:50:22,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:23,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:50:23,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:26,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:27,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:50:28,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:28,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:30,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:30,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:50:33,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:33,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=152853.33333333334, ans=0.1 2023-09-28 21:50:34,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 21:50:36,172 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.210e+02 2.496e+02 2.822e+02 4.651e+02, threshold=4.993e+02, percent-clipped=0.0 2023-09-28 21:50:38,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:38,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 21:50:38,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 21:50:40,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. 
Duration: 0.7 2023-09-28 21:50:40,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:41,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:50:41,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:41,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:41,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 21:50:45,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:46,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:50:46,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:49,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=152920.0, ans=0.95 2023-09-28 21:50:50,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 21:50:54,697 INFO [train.py:1039] (0/4) Epoch 5, batch 1700, loss[loss=0.1988, simple_loss=0.2663, pruned_loss=0.06569, over 24442.00 frames. ], tot_loss[loss=0.2526, simple_loss=0.3105, pruned_loss=0.09733, over 4708371.77 frames. 
], batch size: 58, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:50:56,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:56,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:50:56,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 21:50:57,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:50:57,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:50:57,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:58,350 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:50:59,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:50:59,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:51:01,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 21:51:04,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 21:51:08,205 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=152986.66666666666, ans=0.125 2023-09-28 21:51:13,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:51:13,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=153053.33333333334, ans=0.125 2023-09-28 21:51:14,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=153053.33333333334, ans=0.125 2023-09-28 21:51:15,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:51:19,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=153053.33333333334, ans=0.07 2023-09-28 21:51:22,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:51:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:24,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:51:24,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:26,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 21:51:28,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 21:51:28,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:29,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:51:31,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:51:34,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 21:51:34,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 21:51:35,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:39,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 21:51:39,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:51:46,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=153186.66666666666, ans=0.0 2023-09-28 21:51:48,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:51:49,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:51:50,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 21:51:53,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:51:54,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 21:51:54,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:57,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 21:51:58,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:51:58,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:51:59,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:59,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:00,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:00,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 21:52:01,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=153253.33333333334, ans=0.1 2023-09-28 21:52:02,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:02,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:52:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:05,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:05,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 21:52:09,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:10,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:12,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 21:52:12,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=153253.33333333334, ans=0.0 2023-09-28 21:52:16,903 INFO [train.py:1039] (0/4) Epoch 5, batch 1750, loss[loss=0.2605, simple_loss=0.3292, pruned_loss=0.09589, over 24350.00 frames. ], tot_loss[loss=0.2512, simple_loss=0.3091, pruned_loss=0.09666, over 4701069.06 frames. 
], batch size: 77, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:52:20,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:22,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:22,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:52:25,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 21:52:25,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:52:27,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:52:27,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:30,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 21:52:34,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:36,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 21:52:36,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:38,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 21:52:42,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:52:44,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 21:52:44,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:52:45,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 21:52:53,945 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.07 vs. limit=15.0 2023-09-28 21:52:55,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:52:57,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:52:57,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:02,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:02,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 21:53:03,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=153453.33333333334, ans=0.125 2023-09-28 21:53:05,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:05,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:05,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=153520.0, ans=0.1 2023-09-28 21:53:09,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:09,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:53:10,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 21:53:13,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:17,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 21:53:18,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:20,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 21:53:20,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=153520.0, ans=0.125 2023-09-28 21:53:21,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:53:22,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=153586.66666666666, ans=0.125 2023-09-28 21:53:23,258 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.199e+02 2.496e+02 2.934e+02 4.192e+02, threshold=4.992e+02, percent-clipped=0.0 2023-09-28 21:53:25,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:53:25,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:53:26,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:26,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=153586.66666666666, ans=0.125 2023-09-28 21:53:28,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:31,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:34,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 21:53:36,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:53:36,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 21:53:36,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:38,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:53:38,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:53:38,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:53:38,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:53:39,937 INFO [train.py:1039] (0/4) Epoch 5, batch 1800, loss[loss=0.2427, simple_loss=0.2973, pruned_loss=0.09403, over 23406.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3085, pruned_loss=0.09654, over 4700769.42 frames. ], batch size: 134, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:53:40,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:53:42,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:53:43,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:53:45,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:53:48,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:52,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 21:53:53,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:56,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:53:58,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=153720.0, ans=0.125 2023-09-28 21:54:00,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.51 vs. limit=12.0 2023-09-28 21:54:01,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:01,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:03,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:54:06,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:54:06,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. 
Duration: 0.9 2023-09-28 21:54:06,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:08,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:12,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=153786.66666666666, ans=0.5 2023-09-28 21:54:13,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 21:54:16,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 21:54:16,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 21:54:18,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:18,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:18,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:54:19,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:54:26,514 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 21:54:28,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. 
Duration: 0.62725 2023-09-28 21:54:28,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:31,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 21:54:31,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 21:54:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:54:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:54:35,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:54:39,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 21:54:44,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:54:46,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 21:54:46,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:54:46,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:54:46,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=153920.0, ans=0.125 2023-09-28 21:54:47,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:54:47,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 21:54:51,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:54:51,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:54:55,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 21:54:55,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:57,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:54:59,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:54:59,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:01,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:01,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 21:55:02,231 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.93 vs. limit=15.0 2023-09-28 21:55:02,693 INFO [train.py:1039] (0/4) Epoch 5, batch 1850, loss[loss=0.2527, simple_loss=0.3068, pruned_loss=0.09928, over 23419.00 frames. ], tot_loss[loss=0.2501, simple_loss=0.308, pruned_loss=0.09609, over 4702277.22 frames. ], batch size: 119, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:55:04,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:55:04,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:07,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:55:07,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:15,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:55:15,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 21:55:19,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 21:55:21,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. 
Duration: 0.93 2023-09-28 21:55:21,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=154053.33333333334, ans=0.125 2023-09-28 21:55:22,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=154053.33333333334, ans=0.125 2023-09-28 21:55:26,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:55:26,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 21:55:26,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:55:33,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=154053.33333333334, ans=0.125 2023-09-28 21:55:36,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:55:37,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 21:55:40,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:55:40,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 21:55:40,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=154120.0, ans=10.0 2023-09-28 21:55:44,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 21:55:44,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:46,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:55:47,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:55:49,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:52,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:57,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:55:57,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:58,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:55:58,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:00,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 21:56:02,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:56:05,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 21:56:08,703 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.275e+02 2.646e+02 3.136e+02 5.874e+02, threshold=5.291e+02, percent-clipped=3.0 2023-09-28 21:56:08,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:12,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=154253.33333333334, ans=0.0 2023-09-28 21:56:13,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:56:13,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:56:13,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 21:56:13,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 21:56:16,371 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 21:56:16,504 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 21:56:18,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 21:56:20,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:56:20,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:20,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:21,549 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 21:56:22,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:56:22,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:24,321 INFO [train.py:1039] (0/4) Epoch 5, batch 1900, loss[loss=0.2497, simple_loss=0.2989, pruned_loss=0.1003, over 23660.00 frames. ], tot_loss[loss=0.2505, simple_loss=0.3085, pruned_loss=0.09628, over 4707692.74 frames. ], batch size: 232, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:56:24,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:56:26,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:56:26,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=154320.0, ans=0.5 2023-09-28 21:56:27,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 21:56:28,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 21:56:31,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:31,177 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 21:56:31,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:56:32,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:35,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:39,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:56:41,094 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 21:56:42,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 21:56:43,086 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.89 vs. limit=15.0 2023-09-28 21:56:44,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:46,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:56:46,147 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 21:56:46,202 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 21:56:46,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=154386.66666666666, ans=0.125 2023-09-28 21:56:49,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 21:56:49,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=154386.66666666666, ans=0.0 2023-09-28 21:56:50,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:56:55,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 21:56:58,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 21:57:05,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 21:57:07,502 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:57:08,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 21:57:08,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:10,071 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. 
Duration: 0.896 2023-09-28 21:57:10,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 21:57:10,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=154453.33333333334, ans=0.025 2023-09-28 21:57:12,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 21:57:12,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 21:57:12,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:57:15,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 21:57:20,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:57:20,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=154520.0, ans=0.1 2023-09-28 21:57:22,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:22,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 21:57:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:57:27,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. 
Duration: 0.82 2023-09-28 21:57:28,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:35,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:57:35,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:57:35,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:57:36,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:57:38,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:57:38,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 21:57:39,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:57:42,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:42,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:57:46,319 INFO [train.py:1039] (0/4) Epoch 5, batch 1950, loss[loss=0.2474, simple_loss=0.3259, pruned_loss=0.08444, over 24040.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3108, pruned_loss=0.09811, over 4698157.54 frames. 
], batch size: 80, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:57:46,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:57:46,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:46,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:48,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:57:53,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:57:55,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:55,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:57:58,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 21:57:58,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:57:59,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 21:57:59,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:02,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:58:03,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:03,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:06,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:09,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:58:09,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:58:09,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:58:09,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:14,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:17,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:58:17,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:58:17,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:58:17,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 21:58:19,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:58:19,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:58:21,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:24,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:29,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:58:32,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:58:32,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=154786.66666666666, ans=0.07 2023-09-28 21:58:32,934 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:58:35,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:58:35,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 21:58:37,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 21:58:37,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:58:41,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:41,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:58:42,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:58:45,182 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.57 vs. limit=15.0 2023-09-28 21:58:46,510 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.35 vs. limit=15.0 2023-09-28 21:58:50,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:52,295 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.827e+02 2.269e+02 2.574e+02 2.905e+02 4.607e+02, threshold=5.149e+02, percent-clipped=0.0 2023-09-28 21:58:52,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 21:58:54,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=154920.0, ans=0.2 2023-09-28 21:58:55,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:57,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:00,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:59:00,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:02,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 21:59:02,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:59:03,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:04,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 21:59:06,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:08,890 INFO [train.py:1039] (0/4) Epoch 5, batch 2000, loss[loss=0.2593, simple_loss=0.3076, pruned_loss=0.1055, over 23925.00 frames. ], tot_loss[loss=0.2548, simple_loss=0.3116, pruned_loss=0.09902, over 4694236.18 frames. 
], batch size: 196, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 21:59:11,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:59:12,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:59:12,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:59:15,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:59:17,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:18,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 21:59:20,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:59:23,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:59:25,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 21:59:27,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:59:27,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 21:59:27,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=155053.33333333334, ans=0.025 2023-09-28 21:59:29,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=155053.33333333334, ans=0.125 2023-09-28 21:59:30,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:59:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 21:59:33,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:37,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 21:59:37,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:59:38,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 21:59:38,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 21:59:39,439 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.73 vs. limit=22.5 2023-09-28 21:59:41,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:59:43,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:59:43,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:45,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:59:46,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:59:46,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 21:59:48,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 21:59:48,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:48,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:55,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:58,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 21:59:58,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:59:58,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:59:58,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=155186.66666666666, ans=0.1 2023-09-28 22:00:00,443 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.05 vs. limit=10.0 2023-09-28 22:00:01,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:01,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:01,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:00:01,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:03,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:05,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:00:06,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-09-28 22:00:13,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:00:14,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:15,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=155253.33333333334, ans=0.015 2023-09-28 22:00:18,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:18,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:00:21,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:23,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:23,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:00:24,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:00:27,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:29,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:00:31,036 INFO [train.py:1039] (0/4) Epoch 5, batch 2050, loss[loss=0.2771, simple_loss=0.3254, pruned_loss=0.1144, over 23823.00 frames. ], tot_loss[loss=0.2546, simple_loss=0.3109, pruned_loss=0.09917, over 4688214.82 frames. ], batch size: 179, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:00:31,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:32,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:33,571 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.79 vs. limit=15.0 2023-09-28 22:00:35,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=155320.0, ans=0.0 2023-09-28 22:00:37,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:40,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:00:40,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:41,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:00:42,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 22:00:42,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 22:00:44,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:44,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:00:50,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=155386.66666666666, ans=0.125 2023-09-28 22:00:54,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:00:54,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:59,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 22:01:02,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:01:04,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 22:01:04,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:01:08,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:10,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:11,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 22:01:13,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:14,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:01:16,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:01:16,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:01:19,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:21,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:01:24,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:01:24,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:01:28,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:33,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:01:34,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-09-28 22:01:36,988 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.377e+02 2.692e+02 3.102e+02 5.014e+02, threshold=5.385e+02, percent-clipped=0.0 2023-09-28 22:01:41,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:42,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:01:43,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=155586.66666666666, ans=0.125 2023-09-28 22:01:45,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:01:46,749 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.43 vs. limit=10.0 2023-09-28 22:01:47,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 22:01:49,208 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 22:01:49,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:01:49,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:51,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:01:52,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 22:01:52,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 22:01:52,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 22:01:54,311 INFO [train.py:1039] (0/4) Epoch 5, batch 2100, loss[loss=0.241, simple_loss=0.3168, pruned_loss=0.08258, over 24461.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3079, pruned_loss=0.09775, over 4680439.68 frames. ], batch size: 69, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:01:54,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:57,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:01:58,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=155653.33333333334, ans=0.07 2023-09-28 22:01:59,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:02:01,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:01,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:02:01,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 22:02:04,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 22:02:04,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 22:02:04,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 22:02:05,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:05,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:07,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 22:02:07,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:02:12,804 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.42 vs. limit=12.0 2023-09-28 22:02:14,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=155720.0, ans=0.07 2023-09-28 22:02:15,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 22:02:15,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:02:18,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:02:18,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:02:23,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:02:23,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 22:02:25,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 22:02:28,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 22:02:28,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:28,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 22:02:28,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 22:02:28,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 22:02:31,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:02:33,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:02:35,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-28 22:02:35,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:38,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:39,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:39,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 22:02:39,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:39,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:41,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:43,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 22:02:44,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 22:02:46,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. 
Duration: 0.8 2023-09-28 22:02:46,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=155853.33333333334, ans=0.125 2023-09-28 22:02:48,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=155853.33333333334, ans=0.95 2023-09-28 22:02:50,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:02:54,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:55,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 22:03:00,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:04,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:03:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:04,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:04,860 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.59 vs. limit=15.0 2023-09-28 22:03:05,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-09-28 22:03:05,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:07,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:07,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:03:09,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:03:10,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:12,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 22:03:13,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 22:03:15,126 INFO [train.py:1039] (0/4) Epoch 5, batch 2150, loss[loss=0.2783, simple_loss=0.3244, pruned_loss=0.1161, over 23841.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3064, pruned_loss=0.09624, over 4685076.04 frames. ], batch size: 179, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:03:15,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:03:15,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=155986.66666666666, ans=0.95 2023-09-28 22:03:17,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=155986.66666666666, ans=0.125 2023-09-28 22:03:18,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:03:18,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:03:18,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:03:18,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:03:22,860 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=12.0 2023-09-28 22:03:25,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 22:03:27,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:28,048 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.81 vs. limit=6.0 2023-09-28 22:03:28,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 22:03:30,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:03:30,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:31,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:03:36,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:36,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:03:36,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:03:41,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:41,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 22:03:48,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:48,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:03:49,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:03:49,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:03:51,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:51,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:51,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:53,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 22:03:54,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:03:55,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:56,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:57,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:58,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:04:00,843 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=7.08 vs. limit=12.0 2023-09-28 22:04:01,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:01,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:04:03,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:03,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 22:04:03,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:04:03,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=156186.66666666666, ans=0.1 2023-09-28 22:04:06,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:06,611 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=156186.66666666666, ans=0.125 2023-09-28 22:04:07,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:09,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:04:11,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:04:12,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:12,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:12,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 22:04:14,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 22:04:15,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:04:15,835 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 22:04:15,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:17,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:04:19,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 22:04:19,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:04:19,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-09-28 22:04:19,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 22:04:19,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 22:04:19,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 22:04:21,002 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.209e+02 2.562e+02 3.022e+02 4.431e+02, threshold=5.124e+02, percent-clipped=0.0 2023-09-28 22:04:21,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=156253.33333333334, ans=0.05 2023-09-28 22:04:22,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:22,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:04:22,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:04:23,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=156253.33333333334, ans=0.125 2023-09-28 22:04:24,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:24,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:04:27,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:04:27,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:27,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=156253.33333333334, ans=0.125 2023-09-28 22:04:37,517 INFO [train.py:1039] (0/4) Epoch 5, batch 2200, loss[loss=0.2413, simple_loss=0.3111, pruned_loss=0.08581, over 24603.00 frames. ], tot_loss[loss=0.2493, simple_loss=0.3065, pruned_loss=0.096, over 4687019.17 frames. ], batch size: 68, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:04:37,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:04:37,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 22:04:39,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=156320.0, ans=0.125 2023-09-28 22:04:42,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:04:47,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:49,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:04:49,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:50,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. 
Duration: 0.82225 2023-09-28 22:04:54,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:54,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:54,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 22:04:59,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=156386.66666666666, ans=0.1 2023-09-28 22:05:00,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 22:05:01,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:05:08,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 22:05:10,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:11,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:11,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 22:05:12,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=156453.33333333334, ans=0.125 2023-09-28 22:05:14,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=156453.33333333334, ans=0.125 2023-09-28 22:05:16,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:05:17,281 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:05:18,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 22:05:22,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:05:22,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:22,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:05:25,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:05:27,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:28,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:05:29,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=156520.0, ans=0.05 2023-09-28 22:05:30,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:32,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 22:05:34,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:36,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 22:05:37,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:37,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:05:37,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:39,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:39,870 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=156520.0, ans=0.0 2023-09-28 22:05:40,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:40,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:05:40,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:41,857 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.69 vs. limit=15.0 2023-09-28 22:05:42,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:05:42,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:05:44,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=156586.66666666666, ans=0.1 2023-09-28 22:05:46,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:05:49,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 22:05:50,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:05:54,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:05:54,805 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 22:05:57,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:05:57,800 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. 
Duration: 0.896 2023-09-28 22:05:59,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:05:59,417 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 22:06:00,778 INFO [train.py:1039] (0/4) Epoch 5, batch 2250, loss[loss=0.2529, simple_loss=0.3046, pruned_loss=0.1006, over 23727.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3076, pruned_loss=0.09604, over 4700627.14 frames. ], batch size: 164, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:06:02,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:02,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:06:03,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:06,210 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 22:06:07,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:06:09,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:14,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:06:16,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 22:06:17,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:19,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:19,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:21,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=156720.0, ans=0.125 2023-09-28 22:06:22,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 22:06:22,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:22,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:06:25,139 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=26.26 vs. limit=22.5 2023-09-28 22:06:25,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 22:06:25,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:06:25,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 22:06:28,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:32,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:34,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:06:34,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:06:35,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 22:06:37,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:41,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:06:44,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:47,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:48,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:50,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:52,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:06:53,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:06:58,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:07:00,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:07:05,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:07:05,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:07:06,671 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.161e+02 2.544e+02 3.130e+02 4.790e+02, threshold=5.087e+02, percent-clipped=0.0 2023-09-28 22:07:06,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:07:10,783 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=156920.0, ans=0.2 2023-09-28 22:07:13,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:07:16,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:07:16,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 22:07:16,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 22:07:18,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:07:21,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 22:07:22,990 INFO [train.py:1039] (0/4) Epoch 5, batch 2300, loss[loss=0.2537, simple_loss=0.3222, pruned_loss=0.09261, over 24052.00 frames. ], tot_loss[loss=0.2512, simple_loss=0.3087, pruned_loss=0.09684, over 4694761.90 frames. ], batch size: 80, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:07:24,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:07:24,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:28,685 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.13 vs. limit=15.0 2023-09-28 22:07:31,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:31,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:07:33,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=156986.66666666666, ans=0.125 2023-09-28 22:07:35,361 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 22:07:38,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:07:44,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:07:44,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:07:44,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:07:44,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:44,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 22:07:47,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:07:50,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:07:50,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:07:55,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:07:58,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:08:01,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 22:08:05,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=157120.0, ans=0.1 2023-09-28 22:08:06,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:08:06,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:08:10,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:08:13,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:08:17,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:08:18,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:08:18,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:08:18,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 22:08:21,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=157186.66666666666, ans=0.125 2023-09-28 22:08:23,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:08:23,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:08:24,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:24,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:08:26,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:26,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:08:26,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:08:28,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 22:08:28,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:08:28,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:29,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 22:08:35,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:08:39,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:08:44,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=157320.0, ans=0.125 2023-09-28 22:08:45,636 INFO [train.py:1039] (0/4) Epoch 5, batch 2350, loss[loss=0.239, simple_loss=0.2934, pruned_loss=0.09231, over 23534.00 frames. ], tot_loss[loss=0.2511, simple_loss=0.3089, pruned_loss=0.09662, over 4707866.19 frames. ], batch size: 106, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:08:45,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:45,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:08:45,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:08:49,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:08:49,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:08:49,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:08:50,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 22:08:53,402 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.21 vs. limit=22.5 2023-09-28 22:08:57,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 22:08:57,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 22:09:02,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 22:09:05,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:09:08,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:08,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:08,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:09,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:10,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 22:09:14,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=157386.66666666666, ans=0.125 2023-09-28 22:09:15,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:09:20,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 22:09:21,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 22:09:23,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:09:24,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:09:28,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:09:29,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 22:09:31,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:09:31,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=157453.33333333334, ans=0.125 2023-09-28 22:09:33,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:33,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:33,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:09:37,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:09:39,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. 
Duration: 0.74 2023-09-28 22:09:39,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:09:43,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:09:44,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 22:09:44,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:09:49,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 22:09:49,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:09:51,235 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.164e+02 2.489e+02 2.830e+02 4.285e+02, threshold=4.978e+02, percent-clipped=0.0 2023-09-28 22:09:55,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 22:09:58,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 22:09:59,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:59,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 22:09:59,855 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 22:09:59,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 22:10:02,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 22:10:05,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:10:07,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=157653.33333333334, ans=0.125 2023-09-28 22:10:08,351 INFO [train.py:1039] (0/4) Epoch 5, batch 2400, loss[loss=0.2498, simple_loss=0.3166, pruned_loss=0.09152, over 24571.00 frames. ], tot_loss[loss=0.2524, simple_loss=0.3096, pruned_loss=0.09761, over 4708347.73 frames. ], batch size: 71, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:10:10,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:10:13,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:10:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:10:15,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 22:10:15,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-09-28 22:10:24,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:10:24,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:10:26,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 22:10:27,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=157720.0, ans=0.0 2023-09-28 22:10:29,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:10:30,172 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-09-28 22:10:30,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:31,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 22:10:33,183 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=157720.0, ans=0.125 2023-09-28 22:10:36,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:37,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-09-28 22:10:42,069 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:10:43,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:10:45,054 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=157786.66666666666, ans=0.0 2023-09-28 22:10:48,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 22:10:48,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=157786.66666666666, ans=0.09899494936611666 2023-09-28 22:10:49,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:10:52,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:53,683 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.01 vs. limit=12.0 2023-09-28 22:10:54,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=157786.66666666666, ans=0.0 2023-09-28 22:10:56,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:10:56,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 22:10:58,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 22:11:06,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:08,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:11,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:13,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:11:13,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:11:13,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:11:13,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:14,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:14,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:11:19,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:11:19,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:11:19,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. 
Duration: 0.74 2023-09-28 22:11:20,590 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.18 vs. limit=15.0 2023-09-28 22:11:21,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 22:11:23,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:11:24,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:25,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 22:11:25,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 22:11:25,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 22:11:25,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 22:11:26,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 22:11:26,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:11:28,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:28,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:11:31,675 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 22:11:31,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:33,729 INFO [train.py:1039] (0/4) Epoch 5, batch 2450, loss[loss=0.2594, simple_loss=0.3086, pruned_loss=0.1051, over 23458.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3079, pruned_loss=0.09686, over 4700458.84 frames. ], batch size: 119, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:11:33,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:11:37,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:11:37,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:37,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=157986.66666666666, ans=0.1 2023-09-28 22:11:40,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:40,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:40,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=157986.66666666666, ans=0.0 2023-09-28 22:11:41,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. 
Duration: 0.75 2023-09-28 22:11:48,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:48,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:50,440 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=158053.33333333334, ans=0.0 2023-09-28 22:11:51,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:11:51,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:11:51,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:11:51,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 22:11:53,816 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.28 vs. limit=12.0 2023-09-28 22:11:58,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:59,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:11:59,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:12:03,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 22:12:05,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:05,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=158120.0, ans=0.0 2023-09-28 22:12:06,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:06,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:12:10,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 22:12:10,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:12:10,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=158120.0, ans=0.1 2023-09-28 22:12:18,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:19,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:12:19,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:19,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:12:21,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:12:21,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:12:22,796 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.03 vs. limit=15.0 2023-09-28 22:12:23,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 22:12:26,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:28,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:12:31,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:12:31,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:37,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:12:37,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 22:12:38,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 22:12:39,973 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.190e+02 2.639e+02 3.152e+02 5.360e+02, threshold=5.279e+02, percent-clipped=2.0 2023-09-28 22:12:40,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:12:40,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 22:12:40,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:12:40,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:12:45,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:12:48,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:48,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:12:51,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 22:12:53,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:12:55,808 INFO [train.py:1039] (0/4) Epoch 5, batch 2500, loss[loss=0.2566, simple_loss=0.3004, pruned_loss=0.1064, over 23696.00 frames. ], tot_loss[loss=0.2497, simple_loss=0.3067, pruned_loss=0.09634, over 4692598.66 frames. 
], batch size: 232, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:12:56,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=158320.0, ans=0.0 2023-09-28 22:13:01,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:04,389 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:13:11,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:13:11,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:13:13,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:13,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 22:13:20,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:13:21,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:21,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:13:21,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 22:13:22,809 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.99 vs. limit=15.0 2023-09-28 22:13:23,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 22:13:23,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:25,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:25,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 22:13:25,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:26,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 22:13:26,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:31,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:13:31,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:34,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:13:36,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. 
Duration: 0.86 2023-09-28 22:13:36,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:13:39,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:43,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:48,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:51,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:13:56,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:13:57,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=158520.0, ans=0.0 2023-09-28 22:13:58,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 22:13:58,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:58,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:01,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:14:01,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-28 22:14:01,717 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 22:14:01,718 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 22:14:01,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 22:14:06,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:06,476 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:14:09,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 22:14:09,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 22:14:11,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:14:11,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 22:14:11,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=158586.66666666666, ans=0.1 2023-09-28 22:14:16,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 22:14:20,172 INFO [train.py:1039] (0/4) Epoch 5, batch 2550, loss[loss=0.2598, simple_loss=0.3174, pruned_loss=0.1011, over 23378.00 frames. ], tot_loss[loss=0.2503, simple_loss=0.3078, pruned_loss=0.09634, over 4706181.86 frames. 
], batch size: 93, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:14:20,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:22,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:14:22,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=158653.33333333334, ans=0.0 2023-09-28 22:14:23,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:14:25,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:26,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 22:14:28,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:14:31,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 22:14:32,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:14:34,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:37,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:14:37,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. 
Duration: 0.4555625 2023-09-28 22:14:37,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:14:38,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:14:38,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:40,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=158720.0, ans=0.125 2023-09-28 22:14:43,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:14:43,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 22:14:43,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:43,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:43,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 22:14:51,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=158786.66666666666, ans=0.0 2023-09-28 22:14:58,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=158786.66666666666, ans=0.0 2023-09-28 22:14:59,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 22:15:04,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:04,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:04,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:15:07,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:15:12,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:15:16,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:15:16,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:15:16,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:15:17,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:15:17,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:15:21,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 22:15:21,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:24,944 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.422e+02 2.826e+02 3.525e+02 6.917e+02, threshold=5.653e+02, percent-clipped=3.0 2023-09-28 22:15:29,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:15:30,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 22:15:30,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:15:30,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:30,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:15:33,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:15:34,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:39,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:15:42,638 INFO [train.py:1039] (0/4) Epoch 5, batch 2600, loss[loss=0.2436, simple_loss=0.298, pruned_loss=0.0946, over 23510.00 frames. ], tot_loss[loss=0.2495, simple_loss=0.3081, pruned_loss=0.09547, over 4725858.61 frames. 
], batch size: 256, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:15:42,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:45,743 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 22:15:47,483 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 22:15:47,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:15:48,885 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 22:15:49,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 22:15:49,022 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 22:15:52,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:52,095 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 22:15:52,461 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=158986.66666666666, ans=0.1 2023-09-28 22:15:53,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 22:15:55,261 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 22:15:56,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. 
Duration: 0.8 2023-09-28 22:15:58,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 22:16:01,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 22:16:02,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:16:02,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 22:16:04,584 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 22:16:04,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 22:16:13,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=159053.33333333334, ans=0.07 2023-09-28 22:16:14,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=159120.0, ans=0.2 2023-09-28 22:16:15,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:15,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:16,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:16,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. 
Duration: 0.71 2023-09-28 22:16:19,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:16:23,919 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 22:16:24,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=159120.0, ans=0.95 2023-09-28 22:16:28,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:28,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:30,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 22:16:31,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:31,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:33,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 22:16:37,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:16:37,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:16:38,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:16:41,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-09-28 22:16:42,551 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 22:16:42,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:16:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:48,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:16:50,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 22:16:50,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:53,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:16:53,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:16:59,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 22:17:00,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:17:02,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:17:02,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=159320.0, ans=0.125 2023-09-28 22:17:04,015 INFO [train.py:1039] (0/4) Epoch 5, batch 2650, loss[loss=0.272, simple_loss=0.3209, pruned_loss=0.1116, over 23329.00 frames. ], tot_loss[loss=0.2505, simple_loss=0.3095, pruned_loss=0.09577, over 4725893.22 frames. ], batch size: 119, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:17:09,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 22:17:09,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:11,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:17:11,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=159320.0, ans=0.125 2023-09-28 22:17:12,813 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 22:17:12,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:15,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:17,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 22:17:18,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:17:21,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:17:22,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 22:17:22,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:17:22,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:17:25,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 22:17:28,185 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 22:17:31,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:32,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 22:17:32,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:17:32,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. 
Duration: 0.76 2023-09-28 22:17:37,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=159453.33333333334, ans=0.125 2023-09-28 22:17:38,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:38,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:17:38,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:39,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:42,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 22:17:42,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 22:17:47,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:17:50,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 22:17:52,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:52,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:52,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 22:17:53,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:53,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:55,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:55,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=159520.0, ans=0.125 2023-09-28 22:17:58,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:17:59,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:59,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:18:02,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:18:04,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:05,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:18:05,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 22:18:08,809 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.817e+02 2.225e+02 2.541e+02 3.251e+02 5.495e+02, threshold=5.083e+02, percent-clipped=0.0 2023-09-28 22:18:08,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:18:08,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:18:12,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:13,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:18:13,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:13,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 22:18:17,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:18:20,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:22,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:24,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:18:24,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:18:25,833 INFO [train.py:1039] (0/4) Epoch 5, batch 2700, loss[loss=0.2514, simple_loss=0.2962, pruned_loss=0.1033, over 22691.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3089, pruned_loss=0.09494, over 4728408.55 frames. ], batch size: 322, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:18:25,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:28,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:18:28,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 22:18:31,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:18:32,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:18:35,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:18:35,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:35,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:18:37,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=159653.33333333334, ans=0.2 2023-09-28 22:18:38,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:18:38,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:38,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:18:38,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:18:38,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 22:18:40,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:18:41,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:18:43,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:18:43,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:46,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:18:49,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. 
Duration: 0.98 2023-09-28 22:18:49,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:18:52,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=159720.0, ans=0.125 2023-09-28 22:18:54,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:18:54,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:18:54,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=159720.0, ans=0.125 2023-09-28 22:19:01,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:19:01,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:19:02,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:19:02,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:19:04,282 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.81 vs. 
limit=12.0 2023-09-28 22:19:04,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=159786.66666666666, ans=0.2 2023-09-28 22:19:06,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:06,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=159786.66666666666, ans=0.125 2023-09-28 22:19:07,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:07,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:19:07,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:19:12,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:12,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:19:20,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:19:21,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:19:23,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 22:19:23,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:30,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:30,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:31,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:33,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:35,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:35,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:19:37,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:19:38,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:38,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:41,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. 
Duration: 0.42225 2023-09-28 22:19:41,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:44,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:19:44,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 22:19:47,730 INFO [train.py:1039] (0/4) Epoch 5, batch 2750, loss[loss=0.2555, simple_loss=0.2908, pruned_loss=0.1101, over 23598.00 frames. ], tot_loss[loss=0.2501, simple_loss=0.3084, pruned_loss=0.09592, over 4704819.60 frames. ], batch size: 256, lr: 1.99e-02, grad_scale: 16.0 2023-09-28 22:19:47,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 22:19:47,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:49,617 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-24000.pt 2023-09-28 22:19:54,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:19:54,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:57,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:57,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. 
Duration: 0.67275 2023-09-28 22:19:59,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:00,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:00,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:20:01,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=159986.66666666666, ans=0.125 2023-09-28 22:20:02,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:20:02,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:02,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 22:20:02,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:20:02,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:20:11,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 22:20:12,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:20:12,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:20:14,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:14,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:20:15,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:20:17,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:20:17,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:19,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:20,771 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=160053.33333333334, ans=0.125 2023-09-28 22:20:23,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:20:23,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=160120.0, ans=0.035 2023-09-28 22:20:25,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:20:25,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 22:20:26,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:26,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=160120.0, ans=0.125 2023-09-28 22:20:28,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:20:30,096 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=160120.0, ans=0.0 2023-09-28 22:20:36,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:38,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:20:38,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:42,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:42,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:20:42,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:20:50,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-09-28 22:20:50,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:50,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 22:20:54,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:56,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 22:20:57,661 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.252e+02 2.534e+02 3.037e+02 4.293e+02, threshold=5.069e+02, percent-clipped=0.0 2023-09-28 22:21:02,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:21:04,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:21:04,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 22:21:05,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:21:08,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:21:08,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 22:21:08,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 22:21:12,448 INFO [train.py:1039] (0/4) Epoch 5, batch 2800, loss[loss=0.2361, simple_loss=0.2753, pruned_loss=0.09843, over 23397.00 frames. ], tot_loss[loss=0.2497, simple_loss=0.3065, pruned_loss=0.09645, over 4687853.68 frames. ], batch size: 285, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:21:12,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:21:12,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:14,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:21:14,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 22:21:14,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:17,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:18,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:18,668 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 22:21:18,669 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 22:21:21,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:21:22,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=160320.0, ans=0.07 2023-09-28 22:21:24,635 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.87 vs. limit=15.0 2023-09-28 22:21:25,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:21:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:21:28,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:21:30,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=160386.66666666666, ans=0.125 2023-09-28 22:21:31,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 22:21:34,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:21:35,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 22:21:37,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:37,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 22:21:37,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:21:40,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:21:42,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:42,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:21:44,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:21:54,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:21:56,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:58,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:58,305 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=160453.33333333334, ans=0.0 2023-09-28 22:22:01,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:22:01,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:22:03,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=160520.0, ans=0.125 2023-09-28 22:22:04,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:04,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 22:22:05,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:06,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:06,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:22:08,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=160520.0, ans=0.0 2023-09-28 22:22:09,499 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:22:10,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:10,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:13,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:15,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 22:22:15,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:15,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:22:17,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:22:17,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:22:18,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:22:19,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 22:22:19,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:21,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:22:21,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:25,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 22:22:25,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:25,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 22:22:26,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:22:28,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 22:22:33,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:34,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:22:34,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:22:35,947 INFO [train.py:1039] (0/4) Epoch 5, batch 2850, loss[loss=0.2736, simple_loss=0.3241, pruned_loss=0.1116, over 23470.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3057, pruned_loss=0.09581, over 4682565.35 frames. ], batch size: 134, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:22:36,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=160653.33333333334, ans=0.125 2023-09-28 22:22:37,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:42,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:22:42,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:22:43,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 22:22:45,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:45,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:47,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:22:47,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 22:22:53,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 22:22:53,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:55,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 22:22:56,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:59,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 22:23:01,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 22:23:03,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:23:11,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=160786.66666666666, ans=0.0 2023-09-28 22:23:15,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:16,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:18,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:23:18,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:23:18,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:23:19,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:23:21,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:23:22,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 22:23:25,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:23:25,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:23:27,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:23:27,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:30,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:30,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:33,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:33,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=15.0 2023-09-28 22:23:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:37,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:23:38,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:40,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:41,935 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.140e+02 2.376e+02 2.803e+02 4.746e+02, threshold=4.753e+02, percent-clipped=0.0 2023-09-28 22:23:42,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 22:23:45,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=160920.0, ans=0.0 2023-09-28 22:23:46,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:23:48,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 22:23:48,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 22:23:49,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:23:50,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:50,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 22:23:51,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:23:51,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:51,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:53,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:23:53,190 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. 
Duration: 0.896 2023-09-28 22:23:53,247 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 22:23:53,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:23:54,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:56,275 INFO [train.py:1039] (0/4) Epoch 5, batch 2900, loss[loss=0.2768, simple_loss=0.3272, pruned_loss=0.1133, over 23798.00 frames. ], tot_loss[loss=0.2485, simple_loss=0.3058, pruned_loss=0.09555, over 4694392.74 frames. ], batch size: 212, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:23:58,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:23:58,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:58,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:00,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 22:24:06,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:06,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 22:24:07,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. 
Duration: 0.98 2023-09-28 22:24:09,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:24:09,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:24:12,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:12,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:24:15,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:24:15,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:17,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:24:18,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 22:24:19,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:24:20,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:23,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 22:24:24,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. 
Duration: 0.72 2023-09-28 22:24:27,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:24:27,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 22:24:27,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:24:31,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:24:31,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:24:34,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:35,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:39,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:24:42,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:43,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 22:24:45,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 22:24:45,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-28 22:24:48,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:24:48,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=161186.66666666666, ans=0.125 2023-09-28 22:24:51,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 22:24:51,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:24:53,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=161186.66666666666, ans=0.0 2023-09-28 22:24:57,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:25:07,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:25:07,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:25:07,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=161253.33333333334, ans=0.125 2023-09-28 22:25:09,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 22:25:13,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:13,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. 
Duration: 0.89 2023-09-28 22:25:13,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=161253.33333333334, ans=0.125 2023-09-28 22:25:15,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:15,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:25:20,023 INFO [train.py:1039] (0/4) Epoch 5, batch 2950, loss[loss=0.2516, simple_loss=0.3136, pruned_loss=0.09481, over 23927.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3082, pruned_loss=0.0967, over 4686886.33 frames. ], batch size: 86, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:25:22,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:23,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 22:25:25,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:25,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:26,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:25:28,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 22:25:28,568 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:25:29,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 22:25:29,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 22:25:31,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:25:31,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:33,430 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.75 vs. limit=22.5 2023-09-28 22:25:38,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:25:40,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:25:45,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:25:45,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:25:49,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:25:49,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 22:25:49,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:25:56,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 22:25:57,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 22:25:59,156 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 22:25:59,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:26:02,223 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 22:26:03,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 22:26:03,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:26:03,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:26:03,903 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. 
Duration: 0.896 2023-09-28 22:26:05,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:26:06,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 22:26:08,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:26:09,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:26:11,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:13,325 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=161520.0, ans=0.125 2023-09-28 22:26:13,720 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.83 vs. limit=22.5 2023-09-28 22:26:14,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:26:14,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:14,534 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 22:26:14,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:14,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-09-28 22:26:21,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:22,115 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=161520.0, ans=0.2 2023-09-28 22:26:23,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:26:24,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 22:26:24,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:26:24,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=161586.66666666666, ans=0.1 2023-09-28 22:26:26,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 22:26:27,625 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.929e+02 2.317e+02 2.702e+02 3.273e+02 4.611e+02, threshold=5.405e+02, percent-clipped=0.0 2023-09-28 22:26:29,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:30,253 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.62 vs. limit=15.0 2023-09-28 22:26:32,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:26:32,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:26:34,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:34,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:26:34,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:26:35,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:35,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:26:36,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=161586.66666666666, ans=0.125 2023-09-28 22:26:37,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:26:38,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:39,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:26:40,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:26:40,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 22:26:41,944 INFO [train.py:1039] (0/4) Epoch 5, batch 3000, loss[loss=0.2776, simple_loss=0.3231, pruned_loss=0.116, over 23350.00 frames. ], tot_loss[loss=0.2512, simple_loss=0.3093, pruned_loss=0.09661, over 4704299.85 frames. ], batch size: 285, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:26:41,945 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 22:26:57,272 INFO [train.py:1071] (0/4) Epoch 5, validation: loss=0.3788, simple_loss=0.3301, pruned_loss=0.2137, over 1125622.00 frames. 2023-09-28 22:26:57,272 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 22:26:57,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:59,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:00,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:27:04,954 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 22:27:05,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 22:27:07,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:27:07,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 22:27:08,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 22:27:08,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:12,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:27:17,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=161720.0, ans=0.1 2023-09-28 22:27:22,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:27:25,261 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.50 vs. limit=15.0 2023-09-28 22:27:30,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 22:27:32,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:27:35,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:27:35,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:37,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 22:27:38,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:39,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 22:27:42,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 22:27:43,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:27:45,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:27:46,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:27:46,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:48,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:48,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:27:52,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:27:52,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:52,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 22:27:52,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=161853.33333333334, ans=0.0 2023-09-28 22:27:55,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:57,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 22:27:59,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:28:00,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:01,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:28:04,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:04,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:06,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:28:08,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 22:28:08,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:08,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-09-28 22:28:10,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:28:11,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 22:28:14,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:17,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:28:17,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 22:28:19,150 INFO [train.py:1039] (0/4) Epoch 5, batch 3050, loss[loss=0.2641, simple_loss=0.306, pruned_loss=0.1111, over 23675.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3093, pruned_loss=0.09619, over 4719833.08 frames. ], batch size: 232, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:28:19,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 22:28:19,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:28:20,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:28:20,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:20,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-09-28 22:28:22,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:22,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:28:25,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 22:28:25,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=161986.66666666666, ans=0.1 2023-09-28 22:28:27,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:28:28,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:30,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:28:33,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:38,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 22:28:45,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 22:28:45,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 22:28:45,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:28:51,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:28:52,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:52,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:54,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:28:57,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:28:57,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:57,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:58,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:58,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:00,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:02,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:03,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:29:05,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 22:29:07,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:07,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:29:10,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:29:11,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:29:11,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:11,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:18,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:18,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:25,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:25,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:29:25,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:29:27,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:27,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:29:28,682 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.152e+02 2.476e+02 2.829e+02 3.891e+02, threshold=4.952e+02, percent-clipped=0.0 2023-09-28 22:29:28,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:29:30,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 22:29:31,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:31,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:33,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 22:29:33,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=162253.33333333334, ans=0.1 2023-09-28 22:29:34,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:40,980 INFO [train.py:1039] (0/4) Epoch 5, batch 3100, loss[loss=0.251, simple_loss=0.2849, pruned_loss=0.1085, over 19958.00 frames. ], tot_loss[loss=0.2506, simple_loss=0.3091, pruned_loss=0.09605, over 4718979.14 frames. 
], batch size: 388, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:29:42,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:44,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:29:47,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:29:49,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 22:29:51,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 22:29:51,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=162320.0, ans=0.125 2023-09-28 22:29:53,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 22:29:53,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:29:58,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:58,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:00,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-09-28 22:30:03,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:06,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 22:30:12,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:30:13,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:14,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:14,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:30:14,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:30:18,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:30:18,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 22:30:18,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:30:20,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:20,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-09-28 22:30:22,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:30:25,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:30:27,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 22:30:27,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 22:30:30,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:31,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:34,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:34,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:35,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:30:37,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:30:37,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:30:38,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 22:30:38,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:30:38,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:38,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 22:30:39,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=162520.0, ans=0.125 2023-09-28 22:30:44,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:46,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 22:30:46,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=162586.66666666666, ans=0.2 2023-09-28 22:30:49,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:30:49,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 22:30:51,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:51,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:53,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-09-28 22:31:04,535 INFO [train.py:1039] (0/4) Epoch 5, batch 3150, loss[loss=0.2613, simple_loss=0.3256, pruned_loss=0.0985, over 23932.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3071, pruned_loss=0.09519, over 4719445.73 frames. ], batch size: 86, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:31:04,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 22:31:08,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:08,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:10,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:31:10,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:31:10,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 22:31:10,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=162653.33333333334, ans=0.125 2023-09-28 22:31:11,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:11,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:31:13,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-09-28 22:31:15,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:16,824 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 22:31:17,808 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.11 vs. limit=22.5 2023-09-28 22:31:18,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 22:31:18,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:31:20,049 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 22:31:21,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:31:24,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 22:31:25,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 22:31:25,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 22:31:25,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:25,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:31:26,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:30,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 22:31:31,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:31,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:33,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:34,000 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.70 vs. limit=6.0 2023-09-28 22:31:36,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:31:40,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 22:31:42,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:31:43,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:31:45,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:45,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-09-28 22:31:48,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 22:31:49,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:31:50,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:31:51,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:31:51,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:51,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:31:53,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:31:53,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:31:54,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 22:31:54,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:31:54,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:31:57,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 22:31:57,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:57,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 22:31:59,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:01,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 22:32:01,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:03,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 22:32:05,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 22:32:06,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:32:08,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:09,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 22:32:09,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:32:11,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:32:14,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:32:15,469 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.257e+02 2.534e+02 2.930e+02 4.234e+02, threshold=5.067e+02, percent-clipped=0.0 2023-09-28 22:32:15,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:17,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:32:21,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:32:21,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:24,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 22:32:27,687 INFO [train.py:1039] (0/4) Epoch 5, batch 3200, loss[loss=0.2445, simple_loss=0.3183, pruned_loss=0.08532, over 24283.00 frames. ], tot_loss[loss=0.2467, simple_loss=0.3051, pruned_loss=0.09413, over 4705669.55 frames. ], batch size: 74, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:32:30,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:32:30,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-09-28 22:32:31,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=162986.66666666666, ans=0.0 2023-09-28 22:32:34,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:34,716 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=10.82 vs. limit=12.0 2023-09-28 22:32:36,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:32:36,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 22:32:40,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:43,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=163053.33333333334, ans=0.125 2023-09-28 22:32:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:32:44,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=163053.33333333334, ans=0.2 2023-09-28 22:32:48,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:32:52,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=163053.33333333334, ans=0.0 2023-09-28 22:32:57,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=163053.33333333334, ans=0.0 2023-09-28 22:32:58,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:33:07,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 22:33:07,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:33:11,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 22:33:13,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:33:16,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:33:16,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:33:17,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:33:22,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 22:33:24,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. 
Duration: 0.5 2023-09-28 22:33:26,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 22:33:28,060 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:33:30,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 22:33:33,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:33:34,643 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.49 vs. limit=15.0 2023-09-28 22:33:39,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:39,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:33:39,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:40,011 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 22:33:40,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:33:40,521 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.10 vs. limit=15.0 2023-09-28 22:33:45,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:33:47,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 22:33:48,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 22:33:48,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 22:33:49,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 22:33:50,396 INFO [train.py:1039] (0/4) Epoch 5, batch 3250, loss[loss=0.2338, simple_loss=0.3118, pruned_loss=0.07786, over 24560.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3043, pruned_loss=0.09354, over 4696586.82 frames. ], batch size: 71, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:33:52,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:33:53,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:33:53,767 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 22:33:55,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:33:55,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:33:56,771 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 22:34:02,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-28 22:34:05,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:34:13,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 22:34:13,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:14,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:34:14,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:16,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:17,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:34:20,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:20,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:34:20,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:22,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:34:22,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:34:25,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:26,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:28,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:29,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:30,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:32,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:32,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:34:39,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=163520.0, ans=0.0 2023-09-28 22:34:40,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 22:34:40,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:34:40,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:34:41,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:43,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:34:48,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:34:48,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=163520.0, ans=0.07 2023-09-28 22:34:51,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.38 vs. limit=15.0 2023-09-28 22:34:54,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:34:55,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:55,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 22:34:55,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:34:55,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-28 22:34:57,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:59,765 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.181e+02 2.539e+02 2.910e+02 4.275e+02, threshold=5.078e+02, percent-clipped=0.0 2023-09-28 22:34:59,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 22:34:59,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 22:35:00,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:35:01,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:01,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:03,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:35:03,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:08,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:08,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:11,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-09-28 22:35:11,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:13,133 INFO [train.py:1039] (0/4) Epoch 5, batch 3300, loss[loss=0.249, simple_loss=0.3189, pruned_loss=0.08952, over 23984.00 frames. ], tot_loss[loss=0.2461, simple_loss=0.3053, pruned_loss=0.09349, over 4707583.24 frames. ], batch size: 80, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:35:13,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:35:13,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 22:35:16,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:35:16,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 22:35:19,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 22:35:19,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 22:35:19,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:22,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:24,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. 
Duration: 0.863625 2023-09-28 22:35:24,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:27,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:35:27,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:35:31,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:33,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:34,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=163720.0, ans=0.125 2023-09-28 22:35:36,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 22:35:36,353 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:35:37,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:35:37,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:40,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:40,253 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. 
Duration: 0.896 2023-09-28 22:35:41,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:35:41,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:35:43,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:35:43,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:35:43,444 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 22:35:47,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:47,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:35:50,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:50,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 22:35:50,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 22:35:51,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:51,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 22:35:55,070 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 22:35:56,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 22:35:58,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:00,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 22:36:01,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:03,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:36:05,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:05,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=163853.33333333334, ans=0.125 2023-09-28 22:36:08,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:08,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:08,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:36:09,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-09-28 22:36:11,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:36:11,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:13,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:36:15,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 22:36:15,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 22:36:17,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:36:18,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:36:18,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:20,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:20,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:22,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:36:23,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:36:23,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:36:23,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:26,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:36:28,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 22:36:28,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:30,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:33,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:36:33,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:35,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:36,544 INFO [train.py:1039] (0/4) Epoch 5, batch 3350, loss[loss=0.2832, simple_loss=0.3247, pruned_loss=0.1208, over 23402.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3073, pruned_loss=0.09493, over 4701060.65 frames. ], batch size: 285, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:36:36,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:36:36,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:41,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:41,992 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=163986.66666666666, ans=0.125 2023-09-28 22:36:43,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:43,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=163986.66666666666, ans=0.125 2023-09-28 22:36:44,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:47,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:50,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:36:51,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:51,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:36:54,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 22:36:58,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. 
Duration: 0.896 2023-09-28 22:36:58,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:37:01,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 22:37:01,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 22:37:01,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:37:01,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:37:03,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:04,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 22:37:04,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:04,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:37:06,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:08,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:37:08,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:37:14,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:16,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=164120.0, ans=0.0 2023-09-28 22:37:17,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:17,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:22,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:37:22,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:24,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:24,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:37:26,174 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=164186.66666666666, ans=0.1 2023-09-28 22:37:27,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=164186.66666666666, ans=0.0 2023-09-28 22:37:28,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:30,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 22:37:30,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=164186.66666666666, ans=0.07 2023-09-28 22:37:31,362 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=25.00 vs. limit=22.5 2023-09-28 22:37:31,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:37:31,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 22:37:32,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:37:34,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 22:37:34,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 22:37:35,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:40,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=164253.33333333334, ans=0.125 2023-09-28 22:37:42,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:42,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 22:37:44,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:37:44,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=164253.33333333334, ans=0.07 2023-09-28 22:37:46,088 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.271e+02 2.616e+02 3.188e+02 4.875e+02, threshold=5.232e+02, percent-clipped=0.0 2023-09-28 22:37:46,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:37:46,545 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:37:47,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:37:50,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 22:37:52,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=164253.33333333334, ans=0.2 2023-09-28 22:37:53,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 22:37:54,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:37:55,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:37:56,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:57,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 22:37:58,971 INFO [train.py:1039] (0/4) Epoch 5, batch 3400, loss[loss=0.2458, simple_loss=0.3206, pruned_loss=0.08552, over 24684.00 frames. ], tot_loss[loss=0.2491, simple_loss=0.3078, pruned_loss=0.09524, over 4702930.41 frames. ], batch size: 73, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:37:59,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:59,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 22:38:02,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:38:02,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:38:03,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:38:05,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 22:38:10,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 22:38:10,505 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 22:38:10,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:15,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:38:15,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:38:15,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:16,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:38:22,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:22,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. 
Duration: 0.93 2023-09-28 22:38:27,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:38:30,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:32,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:32,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:38:38,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:38:43,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 22:38:49,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 22:38:51,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:38:53,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:53,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:38:53,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:38:56,086 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=164520.0, ans=0.1 2023-09-28 22:38:58,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:39:00,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=164520.0, ans=0.2 2023-09-28 22:39:01,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:39:01,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:39:07,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:09,588 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=25.42 vs. limit=22.5 2023-09-28 22:39:10,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 22:39:14,431 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=22.5 2023-09-28 22:39:16,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-28 22:39:20,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=164653.33333333334, ans=0.0 2023-09-28 22:39:21,481 INFO [train.py:1039] (0/4) Epoch 5, batch 3450, loss[loss=0.2405, simple_loss=0.309, pruned_loss=0.08603, over 24661.00 frames. ], tot_loss[loss=0.2481, simple_loss=0.3065, pruned_loss=0.09479, over 4704652.30 frames. ], batch size: 68, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:39:21,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 22:39:25,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 22:39:27,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:39:28,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:39:28,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 22:39:29,621 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.82 vs. limit=6.0 2023-09-28 22:39:31,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:36,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:39:39,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 22:39:39,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:39:40,270 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.68 vs. limit=15.0 2023-09-28 22:39:41,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:39:41,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:43,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:45,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=164720.0, ans=0.2 2023-09-28 22:39:48,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 22:39:54,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 22:39:54,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:39:54,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:39:55,590 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.85 vs. 
limit=22.5 2023-09-28 22:39:55,837 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=15.0 2023-09-28 22:39:57,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:02,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 22:40:04,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:40:09,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:09,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:40:11,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:40:13,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:40:15,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 22:40:15,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:16,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:40:18,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 22:40:21,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 22:40:24,499 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.64 vs. limit=22.5 2023-09-28 22:40:25,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:40:29,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:40:31,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:33,109 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.242e+02 2.572e+02 2.948e+02 4.937e+02, threshold=5.144e+02, percent-clipped=0.0 2023-09-28 22:40:33,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:35,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=164920.0, ans=0.1 2023-09-28 22:40:37,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=164920.0, ans=0.1 2023-09-28 22:40:39,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:40,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:40:40,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:40:40,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:42,583 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:40:45,221 INFO [train.py:1039] (0/4) Epoch 5, batch 3500, loss[loss=0.2335, simple_loss=0.3034, pruned_loss=0.08176, over 24497.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.3055, pruned_loss=0.09437, over 4707231.24 frames. ], batch size: 66, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:40:45,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:50,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:40:50,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 22:40:52,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:40:56,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:41:01,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:41:01,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-09-28 22:41:06,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:41:06,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=165053.33333333334, ans=0.125 2023-09-28 22:41:07,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:41:09,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:41:09,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:09,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:41:09,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=165053.33333333334, ans=0.2 2023-09-28 22:41:11,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:11,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:13,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 22:41:16,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:16,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-09-28 22:41:19,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:22,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:22,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 22:41:22,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:26,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:29,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:41:29,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:31,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:41:31,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:32,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 22:41:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 22:41:34,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. 
Duration: 0.78 2023-09-28 22:41:34,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=165186.66666666666, ans=0.125 2023-09-28 22:41:35,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:37,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:39,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:39,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:41:44,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:41:44,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:41:51,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:41:53,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 22:41:53,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 22:41:53,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:41:56,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:41:56,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:41:56,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:59,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 22:42:01,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:42:03,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:42:04,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 22:42:06,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 22:42:07,720 INFO [train.py:1039] (0/4) Epoch 5, batch 3550, loss[loss=0.2483, simple_loss=0.3065, pruned_loss=0.09507, over 23427.00 frames. ], tot_loss[loss=0.2462, simple_loss=0.3052, pruned_loss=0.0936, over 4710344.18 frames. ], batch size: 93, lr: 1.96e-02, grad_scale: 16.0 2023-09-28 22:42:07,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:09,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:42:09,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:42:11,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:13,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=165320.0, ans=0.1 2023-09-28 22:42:14,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:42:16,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=165320.0, ans=0.2 2023-09-28 22:42:18,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=165320.0, ans=0.125 2023-09-28 22:42:23,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=165386.66666666666, ans=0.0 2023-09-28 22:42:25,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:26,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:42:28,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:30,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:42:30,765 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.49 vs. 
limit=10.0 2023-09-28 22:42:32,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:33,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:42:33,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:42:36,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:36,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:42:36,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:37,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:42:38,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:42:43,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:42:43,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=165453.33333333334, ans=0.0 2023-09-28 22:42:44,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:46,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 22:42:46,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:46,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:42:46,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 22:42:46,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:50,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:52,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:42:57,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:57,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:59,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:02,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 22:43:02,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:43:04,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. 
Duration: 0.86 2023-09-28 22:43:04,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:43:05,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=165520.0, ans=0.0 2023-09-28 22:43:07,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:43:07,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:43:11,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 22:43:12,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:17,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:17,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 22:43:19,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:20,844 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.191e+02 2.567e+02 2.914e+02 4.741e+02, threshold=5.134e+02, percent-clipped=0.0 2023-09-28 22:43:24,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:43:28,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-09-28 22:43:33,470 INFO [train.py:1039] (0/4) Epoch 5, batch 3600, loss[loss=0.2303, simple_loss=0.3061, pruned_loss=0.07729, over 24575.00 frames. ], tot_loss[loss=0.2458, simple_loss=0.3049, pruned_loss=0.09335, over 4708966.25 frames. ], batch size: 71, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:43:35,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 22:43:35,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:43:36,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:43:36,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:37,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=165653.33333333334, ans=0.125 2023-09-28 22:43:38,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:43:43,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:43:43,210 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=165653.33333333334, ans=0.125 2023-09-28 22:43:45,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:43:46,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:43:46,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:43:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:48,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 22:43:51,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:43:54,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:56,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:43:59,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:00,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=165720.0, ans=0.125 2023-09-28 22:44:01,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:44:01,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:44:03,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. 
Duration: 0.69 2023-09-28 22:44:04,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:44:08,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:44:08,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:44:11,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:14,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:14,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:14,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 22:44:14,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=165786.66666666666, ans=0.1 2023-09-28 22:44:16,721 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.37 vs. limit=15.0 2023-09-28 22:44:22,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:24,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-28 22:44:25,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 22:44:25,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=165853.33333333334, ans=0.1 2023-09-28 22:44:27,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=165853.33333333334, ans=0.05 2023-09-28 22:44:30,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:44:35,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:37,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:45,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:44:45,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:44:45,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 22:44:47,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 22:44:47,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 22:44:50,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:44:50,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:44:51,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 22:44:51,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:44:53,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:44:53,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:54,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=165986.66666666666, ans=0.1 2023-09-28 22:44:55,046 INFO [train.py:1039] (0/4) Epoch 5, batch 3650, loss[loss=0.2688, simple_loss=0.3182, pruned_loss=0.1097, over 23733.00 frames. ], tot_loss[loss=0.2452, simple_loss=0.3049, pruned_loss=0.09278, over 4722842.88 frames. ], batch size: 179, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:44:55,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 22:44:55,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 22:44:57,660 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.05 vs. limit=15.0 2023-09-28 22:44:58,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:44:59,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 22:45:04,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 22:45:05,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:45:09,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 22:45:10,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 22:45:15,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:15,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:45:15,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:45:18,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:45:20,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:45:20,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 22:45:21,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 22:45:21,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:23,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 22:45:25,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:45:25,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:45:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:25,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=166053.33333333334, ans=0.1 2023-09-28 22:45:28,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:45:30,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 22:45:33,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 22:45:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:45:34,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 22:45:36,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:45:36,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:45:42,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:45:43,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:44,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:45:46,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:45:47,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:45:50,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:45:53,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:54,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:54,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:55,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=166186.66666666666, ans=0.07 2023-09-28 22:45:56,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-28 22:45:56,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:58,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:06,460 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.312e+02 2.641e+02 2.987e+02 4.263e+02, threshold=5.283e+02, percent-clipped=0.0 2023-09-28 22:46:06,574 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 22:46:11,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:11,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:12,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:46:12,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:12,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:46:14,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:15,524 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.64 vs. limit=10.0 2023-09-28 22:46:16,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-09-28 22:46:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:18,622 INFO [train.py:1039] (0/4) Epoch 5, batch 3700, loss[loss=0.2041, simple_loss=0.2711, pruned_loss=0.06859, over 24556.00 frames. ], tot_loss[loss=0.2451, simple_loss=0.3053, pruned_loss=0.09245, over 4721014.85 frames. ], batch size: 60, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:46:18,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:46:19,095 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=166320.0, ans=0.125 2023-09-28 22:46:21,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:46:21,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:46:26,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:26,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 22:46:26,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:27,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:46:28,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-28 22:46:32,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:46:35,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:35,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:37,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:46:37,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:38,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:46:40,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:40,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=166386.66666666666, ans=0.0 2023-09-28 22:46:41,375 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.37 vs. limit=15.0 2023-09-28 22:46:41,647 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.63 vs. limit=5.0 2023-09-28 22:46:41,975 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 22:46:50,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 22:46:51,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:46:51,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:46:53,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 22:46:53,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:46:58,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:58,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 22:47:00,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:01,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:47:03,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:04,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:47:05,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.22 vs. limit=22.5 2023-09-28 22:47:08,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-28 22:47:12,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:47:12,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 22:47:14,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:47:14,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 22:47:19,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:47:19,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:47:19,604 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:47:22,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:23,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 22:47:26,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:47:26,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:47:26,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-28 22:47:26,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:26,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=166586.66666666666, ans=0.0 2023-09-28 22:47:31,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:32,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 22:47:33,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 22:47:33,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:47:33,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:36,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=166586.66666666666, ans=0.5 2023-09-28 22:47:37,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:47:37,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:47:40,442 INFO [train.py:1039] (0/4) Epoch 5, batch 3750, loss[loss=0.2637, simple_loss=0.3232, pruned_loss=0.1021, over 23378.00 frames. ], tot_loss[loss=0.2466, simple_loss=0.3067, pruned_loss=0.09324, over 4724748.80 frames. 
], batch size: 105, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:47:40,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:40,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=166653.33333333334, ans=0.0 2023-09-28 22:47:42,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:47:45,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:47:47,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 22:47:47,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:47:50,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:47:50,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 22:47:52,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:47:53,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 22:47:55,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:47:58,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:02,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:48:03,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:48:05,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:48:10,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:11,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 22:48:12,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:15,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:15,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=166786.66666666666, ans=0.1 2023-09-28 22:48:16,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:48:20,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 22:48:22,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 22:48:23,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:25,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:25,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:27,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=166786.66666666666, ans=0.125 2023-09-28 22:48:31,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:31,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:48:37,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 22:48:37,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=166853.33333333334, ans=0.125 2023-09-28 22:48:40,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:48:41,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:48:41,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:48:46,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:48:49,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:48:49,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=166920.0, ans=0.125 2023-09-28 22:48:51,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:48:52,778 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.387e+02 2.666e+02 3.325e+02 5.060e+02, threshold=5.333e+02, percent-clipped=0.0 2023-09-28 22:48:52,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:48:54,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:48:58,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:49:04,231 INFO [train.py:1039] (0/4) Epoch 5, batch 3800, loss[loss=0.2432, simple_loss=0.3082, pruned_loss=0.08913, over 24618.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.3073, pruned_loss=0.09349, over 4726041.99 frames. 
], batch size: 65, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:49:07,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:49:11,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:13,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:49:13,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 22:49:14,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:16,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:19,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:49:22,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 22:49:22,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:22,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:49:25,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:49:27,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:49:27,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:27,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 22:49:31,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:49:32,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:49:33,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:35,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:49:37,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:49:38,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:49:38,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:41,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:41,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:49:48,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:49:48,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 22:49:50,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:49:58,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:00,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=167186.66666666666, ans=0.125 2023-09-28 22:50:04,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:50:06,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 22:50:10,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 22:50:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:13,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:50:13,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:14,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-09-28 22:50:19,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 22:50:19,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 22:50:20,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:20,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:25,467 INFO [train.py:1039] (0/4) Epoch 5, batch 3850, loss[loss=0.2341, simple_loss=0.3052, pruned_loss=0.08151, over 24526.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3057, pruned_loss=0.09282, over 4724049.99 frames. ], batch size: 66, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:50:25,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:50:25,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:50:31,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:50:32,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 22:50:34,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:50:34,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:50:37,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:50:39,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:41,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:50:41,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=167386.66666666666, ans=0.0 2023-09-28 22:50:43,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 22:50:43,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=167386.66666666666, ans=0.1 2023-09-28 22:50:49,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:52,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:54,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:50:54,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-28 22:50:54,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=167386.66666666666, ans=0.125 2023-09-28 22:50:58,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=167453.33333333334, ans=0.125 2023-09-28 22:50:59,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:59,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:51:01,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:01,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:51:01,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:04,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:06,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:06,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:51:07,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. 
Duration: 0.77 2023-09-28 22:51:07,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 22:51:09,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:09,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:09,929 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.96 vs. limit=22.5 2023-09-28 22:51:11,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:12,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:12,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 22:51:17,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 22:51:19,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:20,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 22:51:22,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:51:24,978 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.59 vs. 
limit=22.5 2023-09-28 22:51:28,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:30,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:35,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:35,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 22:51:37,561 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.127e+02 2.578e+02 3.001e+02 5.626e+02, threshold=5.156e+02, percent-clipped=1.0 2023-09-28 22:51:37,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 22:51:40,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:40,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:45,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:51:45,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:51:45,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 22:51:46,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:51:46,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 22:51:48,208 INFO [train.py:1039] (0/4) Epoch 5, batch 3900, loss[loss=0.2373, simple_loss=0.2975, pruned_loss=0.08854, over 23764.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3036, pruned_loss=0.09233, over 4696674.15 frames. ], batch size: 179, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:51:48,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:48,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 22:51:49,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:50,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:52,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:51:52,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:53,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:51:55,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 22:51:55,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:55,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=167653.33333333334, ans=0.0 2023-09-28 22:51:56,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:51:56,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 22:51:56,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:59,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:01,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:01,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:52:02,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:04,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:04,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:08,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 22:52:09,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 22:52:09,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:11,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 22:52:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:13,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 22:52:16,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 22:52:21,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:21,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:52:21,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:52:22,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:52:25,072 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=15.0 2023-09-28 22:52:26,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-28 22:52:29,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:52:31,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:52:31,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:52:32,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:52:33,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=167786.66666666666, ans=0.0 2023-09-28 22:52:38,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:38,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:52:47,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:52:49,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:52:56,696 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-09-28 22:52:59,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 22:53:02,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:02,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 22:53:02,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 22:53:02,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:05,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 22:53:07,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:53:07,859 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=167920.0, ans=0.125 2023-09-28 22:53:08,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 22:53:10,444 INFO [train.py:1039] (0/4) Epoch 5, batch 3950, loss[loss=0.2426, simple_loss=0.3125, pruned_loss=0.0863, over 24326.00 frames. ], tot_loss[loss=0.2436, simple_loss=0.3038, pruned_loss=0.09174, over 4716666.74 frames. ], batch size: 74, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:53:16,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:53:16,896 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.59 vs. 
limit=12.0 2023-09-28 22:53:17,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 22:53:17,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:53:19,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:53:21,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:53:28,021 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 22:53:28,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:28,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 22:53:28,245 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 22:53:29,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:32,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:53:34,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 22:53:35,222 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=168053.33333333334, ans=0.125 2023-09-28 22:53:36,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 22:53:39,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:53:39,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:39,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:53:41,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:53:42,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:53:55,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:53:56,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:53:57,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=168120.0, ans=0.0 2023-09-28 22:54:00,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 22:54:07,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. 
Duration: 0.99 2023-09-28 22:54:07,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 22:54:07,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=168186.66666666666, ans=0.0 2023-09-28 22:54:08,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:54:08,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:54:09,755 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.84 vs. limit=12.0 2023-09-28 22:54:17,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:54:17,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:54:19,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:54:20,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:54:20,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 22:54:21,768 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.253e+02 2.651e+02 3.133e+02 5.052e+02, threshold=5.303e+02, percent-clipped=0.0 2023-09-28 22:54:24,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 22:54:25,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:54:27,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=168253.33333333334, ans=0.1 2023-09-28 22:54:30,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 22:54:33,579 INFO [train.py:1039] (0/4) Epoch 5, batch 4000, loss[loss=0.2505, simple_loss=0.3137, pruned_loss=0.0936, over 24370.00 frames. ], tot_loss[loss=0.2449, simple_loss=0.3046, pruned_loss=0.09263, over 4707307.36 frames. ], batch size: 77, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:54:40,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:47,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:51,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:54:53,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:54:54,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:54,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 22:54:54,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-09-28 22:54:56,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 22:54:56,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:54:56,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 22:54:58,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:01,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:55:03,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:03,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:55:03,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:03,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:55:04,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:55:06,480 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 22:55:07,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 22:55:09,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:13,048 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 22:55:13,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:55:13,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:18,471 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.41 vs. limit=15.0 2023-09-28 22:55:21,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=168520.0, ans=0.025 2023-09-28 22:55:22,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 22:55:22,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:25,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:55:25,811 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 22:55:27,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:55:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-09-28 22:55:27,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:55:28,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:55:32,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:55:32,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:55:32,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:34,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 22:55:34,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:35,964 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 22:55:41,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:55:45,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:55:49,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 22:55:49,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:49,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:55:51,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:56,184 INFO [train.py:1039] (0/4) Epoch 5, batch 4050, loss[loss=0.2438, simple_loss=0.3044, pruned_loss=0.09162, over 18523.00 frames. ], tot_loss[loss=0.2473, simple_loss=0.3065, pruned_loss=0.09406, over 4704498.95 frames. ], batch size: 40, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:55:56,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:57,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:55:59,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 22:55:59,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:56:01,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:02,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:56:04,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-09-28 22:56:06,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:06,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=168653.33333333334, ans=0.0 2023-09-28 22:56:09,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:13,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:13,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:56:16,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:56:16,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:56:20,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:21,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:24,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 22:56:27,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 22:56:27,341 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. 
Duration: 0.64 2023-09-28 22:56:30,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:56:36,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 22:56:38,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:56:41,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:44,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:46,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:56:46,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:50,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:54,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 22:56:54,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:56:56,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:56:56,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. 
Duration: 0.83 2023-09-28 22:57:02,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:57:05,947 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.62 vs. limit=15.0 2023-09-28 22:57:08,222 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.148e+02 2.565e+02 3.123e+02 5.245e+02, threshold=5.130e+02, percent-clipped=0.0 2023-09-28 22:57:08,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 22:57:08,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:08,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:57:10,142 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:57:12,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 22:57:12,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 22:57:12,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:15,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 22:57:15,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=168920.0, ans=0.125 2023-09-28 22:57:15,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=168920.0, ans=0.0 2023-09-28 22:57:16,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:16,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:57:20,101 INFO [train.py:1039] (0/4) Epoch 5, batch 4100, loss[loss=0.2296, simple_loss=0.2926, pruned_loss=0.08325, over 24461.00 frames. ], tot_loss[loss=0.2479, simple_loss=0.307, pruned_loss=0.09443, over 4705798.14 frames. ], batch size: 58, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:57:23,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 22:57:24,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 22:57:27,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 22:57:28,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 22:57:29,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:29,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:57:29,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:29,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:57:31,552 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 22:57:35,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:35,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:57:35,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:37,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:57:40,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:57:41,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:41,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:57:41,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 22:57:43,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 22:57:43,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:57:43,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:45,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:57:45,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 22:57:48,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:57:51,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 22:57:52,564 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.59 vs. limit=15.0 2023-09-28 22:57:53,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:55,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:55,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 22:57:57,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:58,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. 
Duration: 0.9 2023-09-28 22:57:58,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:57:59,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=169120.0, ans=0.1 2023-09-28 22:58:01,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 22:58:02,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:58:04,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:58:07,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 22:58:08,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:58:09,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:11,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:17,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:17,662 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=169186.66666666666, ans=0.125 2023-09-28 22:58:20,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 22:58:22,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:58:24,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=169253.33333333334, ans=0.125 2023-09-28 22:58:26,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=169253.33333333334, ans=0.125 2023-09-28 22:58:31,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:58:31,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:34,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:37,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:58:42,153 INFO [train.py:1039] (0/4) Epoch 5, batch 4150, loss[loss=0.2609, simple_loss=0.3024, pruned_loss=0.1097, over 23788.00 frames. ], tot_loss[loss=0.248, simple_loss=0.307, pruned_loss=0.09454, over 4710952.96 frames. ], batch size: 164, lr: 1.94e-02, grad_scale: 32.0 2023-09-28 22:58:44,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:44,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-28 22:58:44,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=169320.0, ans=0.0 2023-09-28 22:58:44,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=169320.0, ans=0.0 2023-09-28 22:58:46,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:58:47,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:58:49,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 22:58:50,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:50,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 22:58:50,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 22:58:52,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 22:58:53,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:57,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:58:57,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 22:59:01,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:03,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:03,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:59:06,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:59:06,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:59:07,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:59:13,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:13,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=169453.33333333334, ans=0.0 2023-09-28 22:59:17,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:19,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 22:59:22,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 22:59:22,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-28 22:59:23,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 22:59:23,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:59:23,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:26,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:28,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:30,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 22:59:32,573 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=15.0 2023-09-28 22:59:33,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:59:34,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:59:35,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 22:59:36,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:37,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-09-28 22:59:40,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:59:41,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:44,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:44,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 22:59:44,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:44,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=169520.0, ans=0.2 2023-09-28 22:59:45,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:59:47,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:59:51,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 22:59:51,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:51,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:59:51,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-28 22:59:53,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 22:59:54,524 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.448e+02 2.857e+02 3.478e+02 5.752e+02, threshold=5.715e+02, percent-clipped=2.0 2023-09-28 22:59:54,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:54,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:59:54,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:56,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:57,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 22:59:57,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:00:01,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:00:01,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=169586.66666666666, ans=0.0 2023-09-28 23:00:04,577 INFO [train.py:1039] (0/4) Epoch 5, batch 4200, loss[loss=0.2529, simple_loss=0.2974, pruned_loss=0.1042, over 23812.00 frames. ], tot_loss[loss=0.2464, simple_loss=0.3058, pruned_loss=0.09357, over 4704773.83 frames. 
], batch size: 195, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:00:04,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 23:00:06,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:00:09,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:11,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:00:11,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:11,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:11,838 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.94 vs. limit=10.0 2023-09-28 23:00:14,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 23:00:17,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 23:00:17,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:21,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-28 23:00:23,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:00:26,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:00:28,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:00:28,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:28,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 23:00:28,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:28,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:29,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:29,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:00:32,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:00:34,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 23:00:34,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:00:39,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:00:41,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:00:44,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:00:46,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:00:47,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:00:47,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 23:00:47,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:00:50,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:00:58,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:00:59,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:03,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:01:07,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. 
Duration: 0.85 2023-09-28 23:01:11,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:01:15,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:01:15,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:19,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 23:01:19,582 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.23 vs. limit=22.5 2023-09-28 23:01:26,921 INFO [train.py:1039] (0/4) Epoch 5, batch 4250, loss[loss=0.2331, simple_loss=0.3055, pruned_loss=0.08036, over 24428.00 frames. ], tot_loss[loss=0.2459, simple_loss=0.3047, pruned_loss=0.09356, over 4704269.81 frames. ], batch size: 69, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:01:26,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:01:28,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:28,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:01:31,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 23:01:34,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=169986.66666666666, ans=0.125 2023-09-28 23:01:38,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:01:38,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 23:01:38,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:01:43,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:45,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:01:47,429 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.247e-02 2023-09-28 23:01:50,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:50,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:53,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:01:53,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:01:53,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=170053.33333333334, ans=0.2 2023-09-28 23:01:54,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:57,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:58,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:02:01,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:02:01,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:03,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 23:02:03,450 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=170120.0, ans=0.125 2023-09-28 23:02:06,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 23:02:06,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:08,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:08,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:02:10,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:02:10,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:10,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=170120.0, ans=0.125 2023-09-28 23:02:11,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:15,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:02:16,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:02:20,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:23,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:23,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 23:02:23,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:02:25,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 23:02:26,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 23:02:26,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:02:27,764 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.35 vs. limit=22.5 2023-09-28 23:02:29,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:29,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:02:33,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 23:02:35,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:02:35,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:02:39,734 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.222e+02 2.521e+02 2.962e+02 6.093e+02, threshold=5.043e+02, percent-clipped=1.0 2023-09-28 23:02:39,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:41,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=170253.33333333334, ans=0.1 2023-09-28 23:02:42,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:02:43,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:02:45,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:47,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:47,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=170253.33333333334, ans=0.5 2023-09-28 23:02:48,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:02:49,895 INFO [train.py:1039] (0/4) Epoch 5, batch 4300, loss[loss=0.2714, simple_loss=0.3191, pruned_loss=0.1119, over 23235.00 frames. ], tot_loss[loss=0.2455, simple_loss=0.3038, pruned_loss=0.09363, over 4695931.38 frames. ], batch size: 105, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:02:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:02:49,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 23:02:51,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:58,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:58,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 23:03:01,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:03:08,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:03:08,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 23:03:09,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:03:11,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:03:11,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:03:11,447 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 23:03:13,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=170386.66666666666, ans=0.125 2023-09-28 23:03:15,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:03:18,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:18,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=170386.66666666666, ans=0.07 2023-09-28 23:03:20,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. 
Duration: 0.97 2023-09-28 23:03:21,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:03:21,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 23:03:24,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:03:26,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:03:28,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:03:28,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:30,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:03:30,443 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=170453.33333333334, ans=0.1 2023-09-28 23:03:31,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:31,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:03:31,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 23:03:33,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. 
Duration: 0.88 2023-09-28 23:03:35,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:03:37,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:37,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=170453.33333333334, ans=0.0 2023-09-28 23:03:38,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:03:38,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:38,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:38,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 23:03:38,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 23:03:40,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 23:03:41,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:03:41,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 23:03:41,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. 
Duration: 0.94 2023-09-28 23:03:48,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:50,827 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 23:03:52,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:03:52,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:03:52,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:54,310 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:03:55,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 23:03:57,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:57,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:57,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:03:57,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:03:57,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 23:03:59,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=170586.66666666666, ans=0.0 2023-09-28 23:04:00,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:04:03,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:03,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:05,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:04:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 23:04:10,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:04:13,724 INFO [train.py:1039] (0/4) Epoch 5, batch 4350, loss[loss=0.2347, simple_loss=0.3034, pruned_loss=0.08298, over 23402.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3045, pruned_loss=0.09333, over 4710178.03 frames. ], batch size: 93, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:04:15,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:17,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:22,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 23:04:22,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:04:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:04:29,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=170720.0, ans=0.0 2023-09-28 23:04:30,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:34,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:04:34,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:04:37,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:04:39,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:04:40,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:04:41,740 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.05 vs. limit=15.0 2023-09-28 23:04:45,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 23:04:48,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:04:49,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=170786.66666666666, ans=0.2 2023-09-28 23:04:50,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:55,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:58,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 23:05:00,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:01,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:05:05,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=170853.33333333334, ans=0.035 2023-09-28 23:05:07,283 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 23:05:10,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:10,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:05:12,243 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 23:05:12,363 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-09-28 23:05:12,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:12,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:13,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:05:15,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:15,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:16,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:05:18,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 23:05:18,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:18,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:18,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:20,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 23:05:20,719 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-09-28 23:05:22,119 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 23:05:22,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 23:05:25,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:05:25,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:05:25,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:25,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:05:27,195 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 2.177e+02 2.511e+02 2.905e+02 5.033e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-28 23:05:28,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 23:05:31,961 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 23:05:31,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:34,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.49 vs. limit=15.0 2023-09-28 23:05:35,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:05:35,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:36,630 INFO [train.py:1039] (0/4) Epoch 5, batch 4400, loss[loss=0.2427, simple_loss=0.3003, pruned_loss=0.09253, over 23725.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.3045, pruned_loss=0.09252, over 4731518.91 frames. ], batch size: 149, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:05:36,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:40,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 23:05:40,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 23:05:42,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 23:05:42,638 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 23:05:43,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:05:43,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:45,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 23:05:48,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:05:50,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:50,106 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 23:05:50,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=170986.66666666666, ans=0.2 2023-09-28 23:05:54,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:54,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 23:05:55,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=171053.33333333334, ans=0.0 2023-09-28 23:05:56,759 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 23:05:57,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=171053.33333333334, ans=0.0 2023-09-28 23:05:59,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 23:06:00,466 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.05 vs. limit=15.0 2023-09-28 23:06:02,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 23:06:02,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. 
Duration: 0.96 2023-09-28 23:06:02,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:03,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:03,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:05,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:06,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 23:06:06,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 23:06:07,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:08,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:06:08,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:10,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:11,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:11,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. 
Duration: 0.86 2023-09-28 23:06:11,955 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 23:06:16,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:25,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:26,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 23:06:29,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:06:33,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:06:35,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:06:35,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 23:06:37,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:06:37,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:06:37,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:06:37,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-09-28 23:06:42,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 23:06:46,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 23:06:48,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 23:06:48,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:48,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 23:06:49,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:06:52,713 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.38 vs. limit=15.0 2023-09-28 23:06:55,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:06:57,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 23:07:00,194 INFO [train.py:1039] (0/4) Epoch 5, batch 4450, loss[loss=0.2245, simple_loss=0.3043, pruned_loss=0.07234, over 24520.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.3065, pruned_loss=0.09386, over 4728247.55 frames. ], batch size: 66, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:07:01,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 23:07:03,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:05,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:07:11,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:11,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:07:15,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:18,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:07:23,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:07:23,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:23,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=171386.66666666666, ans=0.05 2023-09-28 23:07:24,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 23:07:24,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 23:07:24,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:24,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:07:24,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:07:27,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:07:32,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=171453.33333333334, ans=0.125 2023-09-28 23:07:33,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:36,136 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.73 vs. limit=22.5 2023-09-28 23:07:36,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:37,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:07:40,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:07:42,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 23:07:42,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 23:07:42,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:07:44,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=171453.33333333334, ans=0.125 2023-09-28 23:07:45,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:45,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 23:07:51,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:07:54,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:54,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 23:07:54,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:54,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:07:54,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:07:54,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=171520.0, ans=0.125 2023-09-28 23:07:55,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:58,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:08:03,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:08:03,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 23:08:05,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:08:06,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.67 vs. limit=15.0 2023-09-28 23:08:08,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:10,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:08:11,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:08:11,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:08:13,316 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.374e+02 2.783e+02 3.317e+02 5.756e+02, threshold=5.567e+02, percent-clipped=2.0 2023-09-28 23:08:15,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:08:18,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 23:08:20,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:08:23,520 INFO [train.py:1039] (0/4) Epoch 5, batch 4500, loss[loss=0.2284, simple_loss=0.3032, pruned_loss=0.07686, over 24591.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.307, pruned_loss=0.09408, over 4725296.98 frames. ], batch size: 71, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:08:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:26,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 23:08:26,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 23:08:28,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 23:08:31,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=171653.33333333334, ans=0.125 2023-09-28 23:08:33,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:33,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:33,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:08:34,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:08:34,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:34,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:34,932 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=171653.33333333334, ans=0.125 2023-09-28 23:08:40,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=171720.0, ans=0.1 2023-09-28 23:08:47,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:08:52,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:52,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:08:55,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:09:01,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:09:06,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:09:11,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:09:14,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:09:14,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 23:09:15,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:16,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 23:09:21,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:09:21,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 23:09:21,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:09:21,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:23,938 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.42 vs. limit=12.0 2023-09-28 23:09:28,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:09:28,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:09:31,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:33,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:09:34,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:09:34,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 23:09:37,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. 
Duration: 0.78 2023-09-28 23:09:37,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 23:09:41,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 23:09:44,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 23:09:46,070 INFO [train.py:1039] (0/4) Epoch 5, batch 4550, loss[loss=0.2626, simple_loss=0.3076, pruned_loss=0.1088, over 23693.00 frames. ], tot_loss[loss=0.246, simple_loss=0.3053, pruned_loss=0.09338, over 4713830.07 frames. ], batch size: 149, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:09:46,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:09:50,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:51,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:54,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:09:59,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:10:02,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:10:02,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 23:10:02,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:10:02,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:06,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:07,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:10:08,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=172053.33333333334, ans=0.0 2023-09-28 23:10:10,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:11,933 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=15.0 2023-09-28 23:10:12,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 23:10:14,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 23:10:14,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:10:16,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-09-28 23:10:18,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=172120.0, ans=0.0 2023-09-28 23:10:20,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 23:10:21,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:24,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 23:10:26,450 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=172120.0, ans=0.125 2023-09-28 23:10:27,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:10:29,266 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=172120.0, ans=0.125 2023-09-28 23:10:30,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:10:33,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 23:10:37,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:10:40,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:40,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:42,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:44,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 23:10:44,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 23:10:44,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:10:45,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 23:10:46,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=172186.66666666666, ans=0.2 2023-09-28 23:10:48,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 23:10:48,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:49,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:49,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:10:51,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:51,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:10:52,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:10:54,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 23:10:55,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:55,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:10:56,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 23:10:57,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:10:57,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 23:10:58,982 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.022e+02 2.307e+02 2.730e+02 4.696e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-28 23:10:59,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=172253.33333333334, ans=0.0 2023-09-28 23:11:00,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-28 23:11:00,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:11:04,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:11:04,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:11:05,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:11:05,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:11:07,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:11:08,889 INFO [train.py:1039] (0/4) Epoch 5, batch 4600, loss[loss=0.2535, simple_loss=0.3057, pruned_loss=0.1006, over 23802.00 frames. ], tot_loss[loss=0.2446, simple_loss=0.3039, pruned_loss=0.09265, over 4725003.42 frames. ], batch size: 212, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:11:11,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:12,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:11:16,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:11:16,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 23:11:16,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:19,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 23:11:21,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:11:22,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=172320.0, ans=0.0 2023-09-28 23:11:25,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:11:25,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:27,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:28,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=172386.66666666666, ans=0.04949747468305833 2023-09-28 23:11:34,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 23:11:36,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:37,448 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.64 vs. 
limit=15.0 2023-09-28 23:11:39,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:42,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:11:45,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:51,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 23:11:51,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:11:53,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:11:58,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:58,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:11:58,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=172520.0, ans=0.0 2023-09-28 23:12:02,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:12:04,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=172520.0, ans=0.1 2023-09-28 23:12:05,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-09-28 23:12:05,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:12:10,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:12,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:12:13,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:13,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 23:12:14,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:15,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 23:12:15,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:15,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=172586.66666666666, ans=0.2 2023-09-28 23:12:16,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:18,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:18,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:12:20,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:22,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 23:12:23,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 23:12:23,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 23:12:23,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:25,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:25,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:27,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:33,419 INFO [train.py:1039] (0/4) Epoch 5, batch 4650, loss[loss=0.2464, simple_loss=0.3185, pruned_loss=0.08716, over 24563.00 frames. ], tot_loss[loss=0.2448, simple_loss=0.3038, pruned_loss=0.09289, over 4718161.20 frames. ], batch size: 71, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:12:38,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:12:41,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:12:41,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:43,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:12:43,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:43,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:45,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:48,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 23:12:53,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:12:55,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 23:12:56,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:58,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 23:12:58,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:12:58,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. 
Duration: 0.99 2023-09-28 23:12:58,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 23:12:58,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:59,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:13:03,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:13:03,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:03,717 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 23:13:06,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:08,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 23:13:09,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:09,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:13:12,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 23:13:13,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:13:16,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=172786.66666666666, ans=0.125 2023-09-28 23:13:18,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:13:21,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:25,870 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=172853.33333333334, ans=0.0 2023-09-28 23:13:29,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:31,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:32,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:32,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:13:35,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 23:13:36,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 23:13:36,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 23:13:36,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. 
Duration: 0.66 2023-09-28 23:13:38,629 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:13:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:45,874 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.231e+02 2.488e+02 2.964e+02 5.544e+02, threshold=4.977e+02, percent-clipped=2.0 2023-09-28 23:13:48,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:13:48,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:13:48,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 23:13:48,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:49,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:49,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:13:51,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:13:53,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:13:53,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 23:13:54,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:56,013 INFO [train.py:1039] (0/4) Epoch 5, batch 4700, loss[loss=0.2624, simple_loss=0.3092, pruned_loss=0.1078, over 23768.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.3043, pruned_loss=0.09223, over 4734516.83 frames. ], batch size: 164, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:13:59,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:59,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:13:59,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:14:01,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:14:01,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:14:02,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 23:14:11,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:11,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:14:13,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:14:13,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:15,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:14:19,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 23:14:19,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 23:14:23,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:23,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:14:24,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:14:26,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:34,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:14:36,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:14:36,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=173120.0, ans=0.1 2023-09-28 23:14:40,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:14:46,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 23:14:48,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:14:51,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:14:54,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 23:14:54,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:14:57,581 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.13 vs. limit=12.0 2023-09-28 23:14:59,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:15:01,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 23:15:01,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:02,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:05,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:15:06,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-28 23:15:08,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 23:15:08,137 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 23:15:10,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.34 vs. limit=15.0 2023-09-28 23:15:11,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:12,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 23:15:14,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:19,184 INFO [train.py:1039] (0/4) Epoch 5, batch 4750, loss[loss=0.2382, simple_loss=0.2957, pruned_loss=0.09041, over 23302.00 frames. ], tot_loss[loss=0.2451, simple_loss=0.3049, pruned_loss=0.09265, over 4716828.20 frames. ], batch size: 105, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:15:19,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 23:15:21,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:15:23,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:15:29,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 23:15:29,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:15:34,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 23:15:36,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=173386.66666666666, ans=0.125 2023-09-28 23:15:37,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:15:37,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:39,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:44,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 23:15:49,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 23:15:51,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 23:15:53,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:55,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:55,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:55,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:57,747 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 23:15:57,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 23:16:01,629 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.63 vs. limit=6.0 2023-09-28 23:16:02,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 23:16:05,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:07,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:09,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-28 23:16:09,083 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 23:16:09,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:12,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:16:14,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:16:17,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 23:16:17,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 23:16:19,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:16:19,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:16:19,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:20,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:16:20,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 23:16:24,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. 
Duration: 0.65 2023-09-28 23:16:24,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=173586.66666666666, ans=0.07 2023-09-28 23:16:25,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:27,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:16:27,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 23:16:29,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:16:31,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:32,602 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.125e+02 2.370e+02 2.784e+02 4.798e+02, threshold=4.741e+02, percent-clipped=0.0 2023-09-28 23:16:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:16:33,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=173586.66666666666, ans=0.2 2023-09-28 23:16:34,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:34,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 23:16:39,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:39,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 23:16:41,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 23:16:42,589 INFO [train.py:1039] (0/4) Epoch 5, batch 4800, loss[loss=0.2763, simple_loss=0.3238, pruned_loss=0.1144, over 23490.00 frames. ], tot_loss[loss=0.2479, simple_loss=0.307, pruned_loss=0.09443, over 4708890.87 frames. ], batch size: 285, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:16:42,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 23:16:45,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:16:45,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:48,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 23:16:53,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:55,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:59,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-28 23:17:02,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:02,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:02,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 23:17:02,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=173720.0, ans=0.125 2023-09-28 23:17:03,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:17:03,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:17:05,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:17:12,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:12,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:12,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:17:12,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=173720.0, ans=0.125 2023-09-28 23:17:16,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 23:17:16,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:17:16,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:17,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:20,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:23,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:17:26,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:17:28,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:32,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 23:17:32,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 23:17:32,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:17:32,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:17:33,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:17:33,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:33,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:17:34,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:17:35,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:37,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:39,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=173853.33333333334, ans=0.125 2023-09-28 23:17:41,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:42,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:17:48,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. 
Duration: 0.94 2023-09-28 23:17:48,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:48,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:49,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:17:49,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:53,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:54,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:17:54,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:54,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:17:54,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:17:56,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:17:58,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=173920.0, ans=0.125 2023-09-28 23:18:00,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 23:18:00,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:00,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:18:01,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 23:18:04,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 23:18:04,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:04,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:06,218 INFO [train.py:1039] (0/4) Epoch 5, batch 4850, loss[loss=0.2385, simple_loss=0.3149, pruned_loss=0.08105, over 24307.00 frames. ], tot_loss[loss=0.248, simple_loss=0.3074, pruned_loss=0.09428, over 4718746.90 frames. ], batch size: 74, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:18:06,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:06,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:09,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:18:19,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. 
Duration: 0.51 2023-09-28 23:18:21,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:27,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:29,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:18:29,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:32,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:32,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:18:34,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:18:34,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 23:18:34,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=174053.33333333334, ans=0.0 2023-09-28 23:18:39,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:42,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:18:42,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-28 23:18:44,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:18:44,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 23:18:46,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:46,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 23:18:50,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 23:18:51,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=174120.0, ans=0.125 2023-09-28 23:18:53,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:19:01,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:19:02,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 23:19:02,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 23:19:02,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:19:03,374 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.96 vs. limit=15.0 2023-09-28 23:19:04,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:19:06,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 23:19:06,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:06,840 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.92 vs. limit=6.0 2023-09-28 23:19:07,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 23:19:08,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:10,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:10,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=174253.33333333334, ans=0.1 2023-09-28 23:19:12,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. 
Duration: 0.73 2023-09-28 23:19:14,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=174253.33333333334, ans=0.125 2023-09-28 23:19:18,959 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.747e+02 2.375e+02 2.676e+02 3.229e+02 5.316e+02, threshold=5.352e+02, percent-clipped=3.0 2023-09-28 23:19:22,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:22,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=174253.33333333334, ans=0.2 2023-09-28 23:19:28,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:19:28,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:30,254 INFO [train.py:1039] (0/4) Epoch 5, batch 4900, loss[loss=0.2521, simple_loss=0.318, pruned_loss=0.09305, over 23199.00 frames. ], tot_loss[loss=0.2466, simple_loss=0.3058, pruned_loss=0.0937, over 4722811.75 frames. ], batch size: 93, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:19:33,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 23:19:33,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:19:33,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=174320.0, ans=0.1 2023-09-28 23:19:39,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:19:40,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:42,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:19:44,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 23:19:48,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=174386.66666666666, ans=0.125 2023-09-28 23:19:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 23:19:54,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 23:19:55,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 23:19:55,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:19:55,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:55,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:19:55,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:55,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 23:19:57,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 23:20:01,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 23:20:01,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:20:02,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:20:05,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:20:08,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:20:08,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:09,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:09,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 23:20:11,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:20:12,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:20:12,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. 
Duration: 0.81 2023-09-28 23:20:12,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 23:20:17,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 23:20:18,274 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.63 vs. limit=15.0 2023-09-28 23:20:19,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:20:21,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:20:22,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:20:22,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:22,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:20:22,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:20:22,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 23:20:24,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:27,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 23:20:30,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:20:32,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=174520.0, ans=0.0 2023-09-28 23:20:34,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 23:20:36,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:20:37,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:20:37,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 23:20:44,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:44,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:20:44,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 23:20:44,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:20:46,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:20:46,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:20:48,127 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=174586.66666666666, ans=0.125 2023-09-28 23:20:52,896 INFO [train.py:1039] (0/4) Epoch 5, batch 4950, loss[loss=0.2551, simple_loss=0.3078, pruned_loss=0.1012, over 23652.00 frames. ], tot_loss[loss=0.245, simple_loss=0.3044, pruned_loss=0.09275, over 4731161.32 frames. ], batch size: 232, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:20:53,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:20:53,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:20:54,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:54,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 23:20:56,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:20:58,961 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.91 vs. limit=12.0 2023-09-28 23:20:59,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:20:59,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:21:01,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-28 23:21:01,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 23:21:01,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:21:02,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 23:21:02,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:02,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:21:02,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:21:03,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:06,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:06,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:21:10,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:21:12,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:21:12,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:21:12,873 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=15.0 2023-09-28 23:21:13,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:21:16,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:21:21,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:25,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:21:26,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:28,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:28,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:21:31,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 23:21:31,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 23:21:33,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 23:21:34,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:21:34,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:21:37,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:21:37,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:21:37,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:21:41,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:42,955 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.94 vs. limit=10.0 2023-09-28 23:21:43,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:21:44,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=174853.33333333334, ans=0.125 2023-09-28 23:21:45,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:21:46,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:21:47,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=174853.33333333334, ans=0.2 2023-09-28 23:21:48,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:48,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 23:21:49,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:21:51,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:21:55,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:21:56,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:21:56,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:21:56,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:57,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:21:57,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:21:59,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:21:59,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:22:01,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:22:03,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 23:22:05,864 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.824e+02 2.215e+02 2.550e+02 3.115e+02 4.856e+02, threshold=5.099e+02, percent-clipped=0.0 2023-09-28 23:22:07,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:12,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 23:22:12,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:22:16,396 INFO [train.py:1039] (0/4) Epoch 5, batch 5000, loss[loss=0.2279, simple_loss=0.2911, pruned_loss=0.08237, over 24435.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.3036, pruned_loss=0.09234, over 4725643.91 frames. ], batch size: 58, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:22:21,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:22:21,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:22,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. 
Duration: 0.61 2023-09-28 23:22:23,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 23:22:26,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:22:28,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 23:22:29,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:22:29,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:22:29,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 23:22:31,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:31,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:22:33,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 23:22:33,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:33,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:22:34,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. 
Duration: 0.95 2023-09-28 23:22:34,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 23:22:36,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:22:36,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 23:22:36,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:22:38,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:38,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:22:38,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 23:22:38,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 23:22:39,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 23:22:39,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:41,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:42,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. 
Duration: 0.86 2023-09-28 23:22:42,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:45,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:47,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:48,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:22:50,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 23:22:50,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:22:52,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:22:57,019 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 23:22:59,199 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.77 vs. limit=22.5 2023-09-28 23:23:00,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:23:01,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:23:01,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:23:04,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 23:23:04,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:23:04,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:05,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 23:23:08,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:11,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:13,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:13,762 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.94 vs. limit=22.5 2023-09-28 23:23:19,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 23:23:25,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:23:28,973 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.09 vs. limit=22.5 2023-09-28 23:23:34,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:36,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:36,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:23:36,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:36,626 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:23:37,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:23:37,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:23:37,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:39,240 INFO [train.py:1039] (0/4) Epoch 5, batch 5050, loss[loss=0.2625, simple_loss=0.3279, pruned_loss=0.0986, over 24310.00 frames. ], tot_loss[loss=0.2445, simple_loss=0.3043, pruned_loss=0.09238, over 4738071.53 frames. ], batch size: 74, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:23:42,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:23:44,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 23:23:44,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:23:47,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:49,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:23:49,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 23:23:50,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:50,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:53,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:23:55,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:23:55,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:24:04,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 23:24:06,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. 
Duration: 0.6 2023-09-28 23:24:08,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:08,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 23:24:09,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:11,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:11,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:24:11,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:24:11,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 23:24:12,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 23:24:13,853 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.26 vs. limit=12.0 2023-09-28 23:24:14,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:16,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=175453.33333333334, ans=0.125 2023-09-28 23:24:17,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. 
Duration: 0.77275 2023-09-28 23:24:19,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=175453.33333333334, ans=0.1 2023-09-28 23:24:21,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:21,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 23:24:22,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:25,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 23:24:27,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:24:28,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=175520.0, ans=0.0 2023-09-28 23:24:29,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:24:29,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:31,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:32,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:24:34,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:24:35,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:35,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:24:35,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:24:36,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 23:24:37,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:24:39,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:43,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:43,275 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 23:24:43,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:24:46,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:24:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:47,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-09-28 23:24:50,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:50,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 23:24:50,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:51,913 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.101e+02 2.649e+02 3.047e+02 4.508e+02, threshold=5.297e+02, percent-clipped=0.0 2023-09-28 23:24:55,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:57,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:57,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 23:24:57,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 23:25:00,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:00,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:01,955 INFO [train.py:1039] (0/4) Epoch 5, batch 5100, loss[loss=0.2335, simple_loss=0.2998, pruned_loss=0.08363, over 24660.00 frames. ], tot_loss[loss=0.2447, simple_loss=0.3051, pruned_loss=0.09215, over 4735392.12 frames. 
], batch size: 65, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:25:02,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:25:04,259 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 23:25:07,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:25:09,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 23:25:11,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 23:25:11,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:12,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:25:15,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:25:17,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 23:25:17,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 23:25:18,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=175720.0, ans=0.125 2023-09-28 23:25:20,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 23:25:22,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:25:25,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:29,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 23:25:30,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:32,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:25:32,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:25:34,450 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.02 vs. limit=15.0 2023-09-28 23:25:35,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 23:25:39,752 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 23:25:42,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 23:25:42,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 23:25:42,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 23:25:47,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:48,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=175786.66666666666, ans=0.125 2023-09-28 23:25:49,552 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.04 vs. limit=15.0 2023-09-28 23:25:56,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:25:59,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 23:25:59,998 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 23:26:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 23:26:03,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 23:26:03,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:26:05,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. 
Duration: 0.89 2023-09-28 23:26:07,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=175920.0, ans=0.125 2023-09-28 23:26:08,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=175920.0, ans=0.1 2023-09-28 23:26:10,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 23:26:11,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:26:13,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:26:15,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 23:26:18,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:26:18,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 23:26:23,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=175986.66666666666, ans=0.0 2023-09-28 23:26:24,728 INFO [train.py:1039] (0/4) Epoch 5, batch 5150, loss[loss=0.2448, simple_loss=0.313, pruned_loss=0.08831, over 23878.00 frames. ], tot_loss[loss=0.2449, simple_loss=0.3053, pruned_loss=0.09229, over 4740316.86 frames. ], batch size: 86, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:26:24,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 23:26:24,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:26:24,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:26:25,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:26:25,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:26:26,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:26:26,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 23:26:26,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 23:26:28,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 23:26:28,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:26:28,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 23:26:30,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:32,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-28 23:26:32,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=175986.66666666666, ans=0.0 2023-09-28 23:26:33,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:35,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:40,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:26:40,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 23:26:41,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:41,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:26:44,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:26:44,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:26:44,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:26:46,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:26:46,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 23:26:46,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 23:26:46,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:26:48,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:26:48,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=176053.33333333334, ans=0.125 2023-09-28 23:26:50,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:26:52,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 23:26:53,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:26:59,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:27:01,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 23:27:05,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:27:05,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=176120.0, ans=0.0 2023-09-28 23:27:11,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:27:13,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:16,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:16,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:19,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 23:27:24,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:27:24,878 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=176186.66666666666, ans=0.125 2023-09-28 23:27:26,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:27:26,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:27:30,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:31,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:32,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-09-28 23:27:36,509 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.87 vs. limit=15.0 2023-09-28 23:27:37,268 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.252e+02 2.481e+02 2.857e+02 3.938e+02, threshold=4.962e+02, percent-clipped=0.0 2023-09-28 23:27:38,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:40,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:27:42,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:42,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:27:44,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:27:44,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:27:44,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:27:44,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:27:47,090 INFO [train.py:1039] (0/4) Epoch 5, batch 5200, loss[loss=0.2077, simple_loss=0.272, pruned_loss=0.07171, over 24270.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3059, pruned_loss=0.09274, over 4732870.62 frames. 
], batch size: 56, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:27:49,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:27:51,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:27:55,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:27:57,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=176320.0, ans=0.02 2023-09-28 23:28:01,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 23:28:01,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:28:02,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:03,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:04,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:28:06,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 23:28:06,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=176386.66666666666, ans=0.125 2023-09-28 23:28:07,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 23:28:09,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:28:09,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:12,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 23:28:16,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:28:17,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:28:17,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 23:28:17,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 23:28:22,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 23:28:22,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:22,315 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. 
Duration: 0.997 2023-09-28 23:28:22,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:25,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:25,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:28:27,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 23:28:27,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:28:30,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:34,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 23:28:34,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 23:28:34,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 23:28:40,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 23:28:40,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:28:47,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. 
Duration: 0.811125 2023-09-28 23:28:47,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:28:48,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 23:28:48,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:48,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:28:48,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:50,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:28:54,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:28:55,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:28:58,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:29:00,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:00,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:29:04,557 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=176586.66666666666, ans=0.125 2023-09-28 23:29:06,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=176586.66666666666, ans=0.0 2023-09-28 23:29:07,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:07,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 23:29:08,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:29:08,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:29:09,848 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-09-28 23:29:10,858 INFO [train.py:1039] (0/4) Epoch 5, batch 5250, loss[loss=0.2615, simple_loss=0.3217, pruned_loss=0.1006, over 23655.00 frames. ], tot_loss[loss=0.2445, simple_loss=0.3042, pruned_loss=0.09243, over 4719393.12 frames. ], batch size: 85, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:29:11,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:11,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:29:11,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. 
Duration: 0.611125 2023-09-28 23:29:14,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:29:17,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:17,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:29:18,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:29:20,689 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=176653.33333333334, ans=0.125 2023-09-28 23:29:24,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:27,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:29:28,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:29:29,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:29:31,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 23:29:31,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:29:34,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:36,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=176720.0, ans=0.0 2023-09-28 23:29:52,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=176786.66666666666, ans=0.0 2023-09-28 23:30:03,305 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=176853.33333333334, ans=0.125 2023-09-28 23:30:16,423 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.340e+02 2.624e+02 3.163e+02 5.259e+02, threshold=5.248e+02, percent-clipped=2.0 2023-09-28 23:30:22,229 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=176920.0, ans=0.0 2023-09-28 23:30:24,705 INFO [train.py:1039] (0/4) Epoch 5, batch 5300, loss[loss=0.2166, simple_loss=0.2803, pruned_loss=0.07638, over 24326.00 frames. ], tot_loss[loss=0.2433, simple_loss=0.3028, pruned_loss=0.09193, over 4717442.04 frames. ], batch size: 56, lr: 1.90e-02, grad_scale: 32.0 2023-09-28 23:30:35,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=176986.66666666666, ans=0.125 2023-09-28 23:30:40,066 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-5.pt 2023-09-28 23:30:45,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:30:45,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-09-28 23:30:45,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 23:30:45,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:46,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:46,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:46,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:46,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:46,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:30:46,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:46,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:30:46,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:30:47,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 23:30:47,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-09-28 23:30:47,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 23:30:47,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:30:47,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 23:30:47,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 23:30:47,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:48,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:48,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:48,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:48,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:30:49,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:49,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:49,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-09-28 23:30:49,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:49,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:49,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:30:49,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:49,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:30:50,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 23:30:50,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:51,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:51,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 23:30:51,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 23:30:51,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:30:51,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:30:51,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 23:30:51,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 23:30:51,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:52,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:30:53,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:53,304 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 23:30:53,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 23:30:53,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:30:53,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:53,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 23:30:53,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 23:30:53,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-09-28 23:30:54,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:57,107 INFO [train.py:1039] (0/4) Epoch 6, batch 0, loss[loss=0.253, simple_loss=0.3224, pruned_loss=0.09182, over 24539.00 frames. ], tot_loss[loss=0.253, simple_loss=0.3224, pruned_loss=0.09182, over 24539.00 frames. ], batch size: 71, lr: 1.78e-02, grad_scale: 32.0 2023-09-28 23:30:57,107 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-28 23:31:12,848 INFO [train.py:1071] (0/4) Epoch 6, validation: loss=0.2892, simple_loss=0.2993, pruned_loss=0.1395, over 1125622.00 frames. 2023-09-28 23:31:12,848 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-28 23:31:16,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 23:31:16,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:31:18,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:31:24,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:24,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:31:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:24,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. 
Duration: 0.68 2023-09-28 23:31:26,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 23:31:27,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:29,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:34,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:31:34,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:34,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=177133.33333333334, ans=0.0 2023-09-28 23:31:35,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 23:31:38,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:40,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=177133.33333333334, ans=0.0 2023-09-28 23:31:44,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 23:31:44,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:49,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 23:31:53,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:31:53,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:31:55,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:31:55,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=177200.0, ans=0.0 2023-09-28 23:32:00,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:32:04,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:32:04,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=177266.66666666666, ans=0.0 2023-09-28 23:32:10,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 23:32:11,323 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.00 vs. 
limit=15.0 2023-09-28 23:32:12,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 23:32:13,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:13,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:15,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:32:17,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:32:17,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 23:32:19,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:23,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:26,041 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=177333.33333333334, ans=0.015 2023-09-28 23:32:26,702 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-09-28 23:32:27,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 23:32:27,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=177333.33333333334, ans=0.0 2023-09-28 23:32:32,013 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 23:32:33,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:32:34,993 INFO [train.py:1039] (0/4) Epoch 6, batch 50, loss[loss=0.2766, simple_loss=0.3218, pruned_loss=0.1157, over 23809.00 frames. ], tot_loss[loss=0.2431, simple_loss=0.3051, pruned_loss=0.09054, over 1075029.86 frames. ], batch size: 164, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:32:38,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:41,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:41,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 23:32:41,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:32:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:32:44,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:46,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:32:49,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:49,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=177466.66666666666, ans=0.125 2023-09-28 23:32:55,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 23:32:55,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:32:55,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=177466.66666666666, ans=0.0 2023-09-28 23:33:02,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:33:04,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 23:33:07,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 23:33:08,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:33:08,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:10,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:33:10,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=177533.33333333334, ans=0.125 2023-09-28 23:33:11,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:11,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:33:13,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:33:13,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:18,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=177533.33333333334, ans=0.125 2023-09-28 23:33:21,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:22,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:33:24,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:33:25,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. 
Duration: 0.96 2023-09-28 23:33:26,831 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.186e+02 2.592e+02 3.142e+02 7.850e+02, threshold=5.184e+02, percent-clipped=2.0 2023-09-28 23:33:27,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:33:28,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:33:28,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 23:33:29,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:30,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 23:33:30,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:30,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:39,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:33:39,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:42,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:33:42,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=177666.66666666666, ans=0.125 2023-09-28 23:33:43,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:43,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:47,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 23:33:47,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 23:33:47,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:48,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:50,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:50,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:51,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 23:33:51,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 23:33:54,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. 
Duration: 0.47775 2023-09-28 23:33:54,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:54,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:33:56,337 INFO [train.py:1039] (0/4) Epoch 6, batch 100, loss[loss=0.2299, simple_loss=0.3, pruned_loss=0.07986, over 24649.00 frames. ], tot_loss[loss=0.242, simple_loss=0.3055, pruned_loss=0.08928, over 1897980.22 frames. ], batch size: 65, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:33:56,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 23:33:56,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 23:33:58,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:58,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:01,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:34:01,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:34:02,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:34:05,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 23:34:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:11,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 23:34:12,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:34:17,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:34:17,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:17,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:17,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:34:19,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:19,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 23:34:21,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:34:21,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:21,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:34:21,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:25,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 23:34:25,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:27,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:28,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:34:30,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:34:31,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=177866.66666666666, ans=0.125 2023-09-28 23:34:34,201 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 23:34:34,228 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 23:34:37,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:34:37,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:34:40,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. 
Duration: 0.87775 2023-09-28 23:34:42,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:43,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:51,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:53,448 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 23:34:54,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:34:58,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:34:59,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:01,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:05,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:05,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=178000.0, ans=0.125 2023-09-28 23:35:07,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:08,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-28 23:35:10,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:11,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:14,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:15,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:35:15,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:16,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 23:35:18,628 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 23:35:18,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:18,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:35:20,641 INFO [train.py:1039] (0/4) Epoch 6, batch 150, loss[loss=0.2579, simple_loss=0.3273, pruned_loss=0.09428, over 24414.00 frames. ], tot_loss[loss=0.243, simple_loss=0.3061, pruned_loss=0.08998, over 2537490.17 frames. ], batch size: 77, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:35:20,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:35:20,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:20,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:35:22,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:35:22,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:35:22,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:22,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:24,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:24,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:35:24,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:35:25,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=178066.66666666666, ans=0.0 2023-09-28 23:35:25,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=178066.66666666666, ans=0.125 2023-09-28 23:35:27,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:31,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:31,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:35:31,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:34,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:34,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=178066.66666666666, ans=0.125 2023-09-28 23:35:35,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:37,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:35:38,186 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.59 vs. 
limit=22.5 2023-09-28 23:35:39,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:41,431 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.13 vs. limit=22.5 2023-09-28 23:35:42,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 23:35:42,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 23:35:42,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 23:35:47,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:35:47,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:35:47,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:48,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:48,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:50,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:50,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:35:54,350 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 23:35:55,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:01,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:04,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:36:06,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 23:36:09,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:36:09,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:10,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:11,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=178266.66666666666, ans=10.0 2023-09-28 23:36:12,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:36:13,833 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.160e+02 2.435e+02 3.119e+02 4.742e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-28 23:36:15,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:36:15,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:36:16,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:17,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 23:36:23,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:25,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:25,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:36:25,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:36:28,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:31,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 23:36:33,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:36:34,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:36:36,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:36:38,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:36:38,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 23:36:38,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:38,551 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 23:36:42,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:43,669 INFO [train.py:1039] (0/4) Epoch 6, batch 200, loss[loss=0.2219, simple_loss=0.2892, pruned_loss=0.07727, over 24351.00 frames. ], tot_loss[loss=0.2443, simple_loss=0.3063, pruned_loss=0.0911, over 3025015.43 frames. ], batch size: 61, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:36:46,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:36:46,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:36:48,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 23:36:50,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:50,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 23:36:50,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=178400.0, ans=0.125 2023-09-28 23:36:51,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 23:36:53,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:36:54,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:56,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:56,991 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-09-28 23:37:01,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:37:01,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:37:01,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:10,086 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=178466.66666666666, ans=0.1 2023-09-28 23:37:14,098 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.57 vs. 
limit=12.0 2023-09-28 23:37:18,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=178533.33333333334, ans=0.125 2023-09-28 23:37:24,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:37:24,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:37:24,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:37:25,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:37:26,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 23:37:26,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:37:29,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:30,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:37:32,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:32,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 23:37:32,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 23:37:32,546 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=178600.0, ans=0.0 2023-09-28 23:37:33,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:37:33,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:34,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=178600.0, ans=0.0 2023-09-28 23:37:39,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:37:44,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:45,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=178600.0, ans=0.0 2023-09-28 23:37:46,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=178600.0, ans=0.125 2023-09-28 23:37:51,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:53,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 23:37:56,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=178666.66666666666, ans=0.0 2023-09-28 23:37:59,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:02,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 23:38:02,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:02,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:38:02,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:04,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:38:05,648 INFO [train.py:1039] (0/4) Epoch 6, batch 250, loss[loss=0.2306, simple_loss=0.2879, pruned_loss=0.08663, over 20586.00 frames. ], tot_loss[loss=0.2446, simple_loss=0.3056, pruned_loss=0.09176, over 3394699.91 frames. ], batch size: 44, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:38:07,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 23:38:07,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 23:38:07,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=178733.33333333334, ans=0.125 2023-09-28 23:38:08,674 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 23:38:10,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:38:12,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:17,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:38:17,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:19,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:38:24,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 23:38:24,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=178800.0, ans=0.1 2023-09-28 23:38:25,044 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-28 23:38:35,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:37,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:37,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=178866.66666666666, ans=0.0 2023-09-28 23:38:39,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:38:46,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:38:46,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:38:47,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:38:47,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:49,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-28 23:38:49,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:38:51,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:52,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:38:54,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=178933.33333333334, ans=0.125 2023-09-28 23:38:56,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 23:38:56,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:57,093 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.60 vs. limit=6.0 2023-09-28 23:38:57,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:38:59,141 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.163e+02 2.470e+02 2.985e+02 4.206e+02, threshold=4.941e+02, percent-clipped=0.0 2023-09-28 23:38:59,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:38:59,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-28 23:39:01,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:02,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:39:02,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:39:04,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:06,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:39:06,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:09,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:39:09,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=178933.33333333334, ans=0.0 2023-09-28 23:39:12,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:14,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 23:39:16,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=179000.0, ans=0.125 2023-09-28 23:39:21,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:23,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:39:26,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 23:39:26,737 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=179000.0, ans=0.0 2023-09-28 23:39:26,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=179000.0, ans=0.0 2023-09-28 23:39:28,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:39:28,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:29,806 INFO [train.py:1039] (0/4) Epoch 6, batch 300, loss[loss=0.2509, simple_loss=0.322, pruned_loss=0.08989, over 24677.00 frames. ], tot_loss[loss=0.2424, simple_loss=0.3023, pruned_loss=0.09131, over 3674810.10 frames. ], batch size: 73, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:39:30,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 23:39:30,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-09-28 23:39:31,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:39:31,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 23:39:36,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:38,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:39:40,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:39:41,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 23:39:41,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:44,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:39:44,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 23:39:44,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:39:49,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:39:54,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-28 23:39:54,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 23:39:58,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 23:39:58,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:01,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:03,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:03,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 23:40:03,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:40:06,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:40:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:40:10,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:14,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:40:14,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-09-28 23:40:15,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=179200.0, ans=0.125 2023-09-28 23:40:16,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:40:17,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:19,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 23:40:21,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:21,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.33 vs. limit=22.5 2023-09-28 23:40:24,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:40:28,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:40:28,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 23:40:31,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:31,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:40:35,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:40:37,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:40:37,536 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.68 vs. limit=10.0 2023-09-28 23:40:38,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 23:40:38,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:40:39,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:40,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 23:40:43,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:43,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:43,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=179333.33333333334, ans=0.1 2023-09-28 23:40:45,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:46,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:40:46,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:48,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=179333.33333333334, ans=0.125 2023-09-28 23:40:51,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=179400.0, ans=0.125 2023-09-28 23:40:52,693 INFO [train.py:1039] (0/4) Epoch 6, batch 350, loss[loss=0.2554, simple_loss=0.3205, pruned_loss=0.09517, over 24093.00 frames. ], tot_loss[loss=0.2409, simple_loss=0.3005, pruned_loss=0.09062, over 3912531.76 frames. ], batch size: 86, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:40:52,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:40:52,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:40:53,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=179400.0, ans=0.1 2023-09-28 23:40:55,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:03,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:41:05,285 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.59 vs. limit=22.5 2023-09-28 23:41:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:41:07,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:10,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=179466.66666666666, ans=6.0 2023-09-28 23:41:10,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 23:41:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:11,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 23:41:14,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:15,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 23:41:16,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:21,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 23:41:21,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:41:24,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:24,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 23:41:24,916 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.16 vs. limit=6.0 2023-09-28 23:41:25,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:25,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:27,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:41:27,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:28,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:41:29,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=179533.33333333334, ans=0.125 2023-09-28 23:41:30,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:41:30,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:39,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:41:39,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-09-28 23:41:40,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:41:40,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:41,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=179600.0, ans=0.125 2023-09-28 23:41:45,461 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.204e+02 2.490e+02 2.803e+02 5.345e+02, threshold=4.981e+02, percent-clipped=1.0 2023-09-28 23:41:47,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 23:41:47,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:54,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:54,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:41:54,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:56,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 23:41:57,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:41:59,173 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. 
Duration: 0.976 2023-09-28 23:42:00,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 23:42:00,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:03,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:42:03,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 23:42:05,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:09,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:42:10,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:12,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:12,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:15,716 INFO [train.py:1039] (0/4) Epoch 6, batch 400, loss[loss=0.2425, simple_loss=0.3063, pruned_loss=0.08936, over 24674.00 frames. ], tot_loss[loss=0.2406, simple_loss=0.3002, pruned_loss=0.09047, over 4084335.75 frames. ], batch size: 65, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:42:15,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:42:17,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:42:21,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:42:21,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 23:42:21,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:23,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:25,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:42:25,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=179733.33333333334, ans=0.125 2023-09-28 23:42:26,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:29,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:29,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:31,521 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.16 vs. 
limit=15.0 2023-09-28 23:42:32,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 23:42:33,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 23:42:33,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:35,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 23:42:35,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:35,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=179800.0, ans=0.2 2023-09-28 23:42:39,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:42:39,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:40,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 23:42:41,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:42:41,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:41,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-28 23:42:41,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:43,226 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 23:42:43,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 23:42:50,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:50,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:52,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 23:42:53,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 23:42:57,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:42:59,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:02,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=179866.66666666666, ans=0.125 2023-09-28 23:43:03,914 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.89 vs. limit=15.0 2023-09-28 23:43:05,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. 
Duration: 0.83 2023-09-28 23:43:05,682 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.98 vs. limit=15.0 2023-09-28 23:43:08,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:43:11,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 23:43:12,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:43:14,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:43:15,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 23:43:19,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:43:19,937 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.35 vs. limit=15.0 2023-09-28 23:43:21,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:43:23,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:43:24,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:43:24,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 23:43:26,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=180000.0, ans=0.125 2023-09-28 23:43:29,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:43:31,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 23:43:32,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:43:32,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:43:36,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 23:43:38,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:43:38,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:43:38,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:43:39,731 INFO [train.py:1039] (0/4) Epoch 6, batch 450, loss[loss=0.2575, simple_loss=0.3111, pruned_loss=0.102, over 23754.00 frames. ], tot_loss[loss=0.2416, simple_loss=0.3012, pruned_loss=0.09101, over 4214374.53 frames. 
], batch size: 212, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:43:41,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 23:43:41,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:43:41,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:43:42,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:43:42,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 23:43:43,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:43:44,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:43:46,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:43:54,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=180133.33333333334, ans=0.125 2023-09-28 23:43:59,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:59,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 23:44:01,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 23:44:02,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 23:44:06,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:44:10,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:12,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:17,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:17,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:19,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 23:44:20,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 23:44:22,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 23:44:22,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:44:23,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:44:25,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:44:28,914 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 23:44:28,928 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 23:44:28,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:30,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:44:31,953 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.131e+02 2.453e+02 2.864e+02 4.653e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-28 23:44:32,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:44:37,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:44:37,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:44:38,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:44:39,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 23:44:42,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-28 23:44:44,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:44:44,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=180333.33333333334, ans=0.05 2023-09-28 23:44:45,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:44:45,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 23:44:50,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:44:50,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 23:44:52,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 23:44:53,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:57,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:44:59,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:00,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:45:00,200 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-09-28 23:45:02,017 INFO [train.py:1039] (0/4) Epoch 6, batch 500, loss[loss=0.3247, simple_loss=0.3578, pruned_loss=0.1458, over 19427.00 frames. ], tot_loss[loss=0.2414, simple_loss=0.3014, pruned_loss=0.09073, over 4323525.70 frames. ], batch size: 388, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:45:05,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:06,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:45:06,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:06,868 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 23:45:09,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 23:45:09,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:13,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:45:17,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:45:17,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-09-28 23:45:19,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=180466.66666666666, ans=0.1 2023-09-28 23:45:19,375 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:45:20,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:20,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:20,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:31,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:45:33,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:45:33,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 23:45:35,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:45:38,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. 
Duration: 0.9 2023-09-28 23:45:40,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:45:40,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:45:40,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:40,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 23:45:43,408 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 23:45:47,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:45:48,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:48,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:45:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 23:45:55,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-28 23:45:56,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=180600.0, ans=0.1 2023-09-28 23:45:57,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:01,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:06,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:46:14,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:16,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 23:46:16,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:16,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:18,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 23:46:19,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:46:19,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:24,644 INFO [train.py:1039] (0/4) Epoch 6, batch 550, loss[loss=0.1958, simple_loss=0.2672, pruned_loss=0.06226, over 24599.00 frames. 
], tot_loss[loss=0.2437, simple_loss=0.3031, pruned_loss=0.09213, over 4412948.37 frames. ], batch size: 60, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:46:28,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 23:46:28,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 23:46:28,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:29,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 23:46:30,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:46:30,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:30,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:32,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:33,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:46:33,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:46:36,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. 
Duration: 0.963625 2023-09-28 23:46:38,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 23:46:38,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:46:42,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=180800.0, ans=0.2 2023-09-28 23:46:43,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:46:43,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:45,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:46:45,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=180800.0, ans=0.07 2023-09-28 23:46:47,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:51,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 23:46:51,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 23:46:53,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:47:00,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 23:47:00,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:02,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:47:05,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:05,909 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 23:47:07,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:47:08,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:47:13,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:13,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:47:13,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:47:13,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:15,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 23:47:17,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-09-28 23:47:18,454 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.230e+02 2.579e+02 3.045e+02 5.000e+02, threshold=5.158e+02, percent-clipped=1.0 2023-09-28 23:47:18,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:18,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:47:20,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:47:20,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:47:23,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:47:24,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:47:26,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:47:27,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:29,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:47:29,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-28 23:47:33,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:34,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:47:34,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:36,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:47:36,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:47:44,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 23:47:47,848 INFO [train.py:1039] (0/4) Epoch 6, batch 600, loss[loss=0.2086, simple_loss=0.2722, pruned_loss=0.07249, over 24350.00 frames. ], tot_loss[loss=0.244, simple_loss=0.3036, pruned_loss=0.09218, over 4462029.53 frames. ], batch size: 56, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:47:49,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 23:47:51,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:47:51,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:47:51,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-28 23:47:57,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:47:59,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:48:00,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 23:48:03,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:48:04,641 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.37 vs. limit=22.5 2023-09-28 23:48:06,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:07,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:08,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=181133.33333333334, ans=0.125 2023-09-28 23:48:09,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 23:48:09,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:48:18,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-09-28 23:48:21,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:48:21,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:21,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:48:29,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:48:29,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:48:29,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:33,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=181200.0, ans=0.2 2023-09-28 23:48:35,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181266.66666666666, ans=0.1 2023-09-28 23:48:37,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=181266.66666666666, ans=0.0 2023-09-28 23:48:38,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:48:42,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:42,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. 
Duration: 0.82725 2023-09-28 23:48:42,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 23:48:55,919 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.38 vs. limit=15.0 2023-09-28 23:48:56,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:48:56,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:48:57,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181333.33333333334, ans=0.1 2023-09-28 23:49:03,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 23:49:03,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:49:06,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 23:49:06,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:49:06,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 23:49:06,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181333.33333333334, ans=0.1 2023-09-28 23:49:11,013 INFO [train.py:1039] (0/4) Epoch 6, batch 650, loss[loss=0.2179, simple_loss=0.255, pruned_loss=0.09045, over 19572.00 frames. ], tot_loss[loss=0.2426, simple_loss=0.3023, pruned_loss=0.09148, over 4498258.27 frames. ], batch size: 388, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:49:13,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:49:14,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:49:16,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:49:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:49:19,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:19,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=181400.0, ans=0.125 2023-09-28 23:49:23,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 23:49:23,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:49:30,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-28 23:49:30,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:35,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:38,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 23:49:39,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:49:40,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:43,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:45,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 23:49:47,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:48,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:49:49,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=181533.33333333334, ans=0.1 2023-09-28 23:49:50,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:49:50,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:49:52,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:49:54,012 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 23:49:54,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:54,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:49:54,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=181533.33333333334, ans=0.05 2023-09-28 23:49:57,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:57,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=181533.33333333334, ans=0.0 2023-09-28 23:49:58,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:49:58,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:49:58,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-09-28 23:49:59,659 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.74 vs. limit=15.0 2023-09-28 23:50:00,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 23:50:01,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:50:01,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:50:05,126 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.251e+02 2.578e+02 2.975e+02 4.088e+02, threshold=5.156e+02, percent-clipped=0.0 2023-09-28 23:50:05,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:50:05,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:50:05,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:50:08,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 23:50:08,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 23:50:09,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:09,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. 
Duration: 0.92725 2023-09-28 23:50:09,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:50:09,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:50:10,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:50:18,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:18,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:50:21,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:50:24,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:24,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 23:50:25,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:33,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:50:33,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:50:33,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:33,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=181733.33333333334, ans=0.0 2023-09-28 23:50:34,416 INFO [train.py:1039] (0/4) Epoch 6, batch 700, loss[loss=0.2178, simple_loss=0.28, pruned_loss=0.07778, over 24337.00 frames. ], tot_loss[loss=0.2397, simple_loss=0.2997, pruned_loss=0.08989, over 4554641.43 frames. ], batch size: 56, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:50:34,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:37,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 23:50:37,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 23:50:41,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 23:50:43,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:45,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:50:48,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 23:50:49,332 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.87 vs. 
limit=5.0 2023-09-28 23:50:52,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:53,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=181800.0, ans=0.2 2023-09-28 23:50:55,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:50:56,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:58,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:50:58,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:51:02,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:51:05,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 23:51:05,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:51:08,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 23:51:12,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 23:51:16,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. 
Duration: 0.7 2023-09-28 23:51:16,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:51:17,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:51:22,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:51:22,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 23:51:29,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:29,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:51:29,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 23:51:34,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:51:34,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:37,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:51:42,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=182000.0, ans=0.0 2023-09-28 23:51:44,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. 
Duration: 0.988875 2023-09-28 23:51:44,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 23:51:44,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=182000.0, ans=0.125 2023-09-28 23:51:47,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 23:51:47,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 23:51:50,620 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.62 vs. limit=22.5 2023-09-28 23:51:51,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:52,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:51:54,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:51:56,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:56,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 23:51:57,419 INFO [train.py:1039] (0/4) Epoch 6, batch 750, loss[loss=0.2096, simple_loss=0.2776, pruned_loss=0.07077, over 24345.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.2985, pruned_loss=0.08895, over 4598494.39 frames. 
], batch size: 56, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:52:02,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 23:52:02,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 23:52:03,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 23:52:03,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 23:52:05,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 23:52:05,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:52:07,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 23:52:09,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:52:09,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:12,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:14,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:15,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. 
Duration: 0.711125 2023-09-28 23:52:15,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:17,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:52:18,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:52:21,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:52:24,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:25,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:25,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 23:52:27,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:52:28,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:30,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:32,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:52:32,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. 
Duration: 0.65 2023-09-28 23:52:32,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:52:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 23:52:35,507 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 23:52:36,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 23:52:37,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:52:37,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:52:39,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:52:44,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:44,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:52:44,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:52:44,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=182200.0, ans=0.0 2023-09-28 23:52:47,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-28 23:52:49,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:51,234 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.788e+02 2.168e+02 2.495e+02 2.811e+02 4.815e+02, threshold=4.990e+02, percent-clipped=0.0 2023-09-28 23:52:51,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 23:52:51,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:52:53,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:52:53,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:52:56,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:52:56,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 23:52:57,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:05,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:05,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:53:07,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-28 23:53:10,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:53:14,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 23:53:14,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:14,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:16,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:18,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:18,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=182333.33333333334, ans=0.1 2023-09-28 23:53:21,139 INFO [train.py:1039] (0/4) Epoch 6, batch 800, loss[loss=0.2053, simple_loss=0.2769, pruned_loss=0.06689, over 24587.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.2996, pruned_loss=0.08871, over 4614377.96 frames. ], batch size: 60, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:53:21,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:21,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:53:29,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 23:53:29,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:31,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:31,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:34,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:34,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:35,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:40,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:41,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:53:44,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 23:53:44,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:45,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:47,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. 
Duration: 0.736375 2023-09-28 23:53:47,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:47,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 23:53:47,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:47,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 23:53:49,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:51,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:54,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:54,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:56,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:58,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:02,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:54:04,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-28 23:54:04,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:54:07,647 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 23:54:09,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 23:54:09,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:54:09,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:09,492 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=182600.0, ans=0.1 2023-09-28 23:54:09,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=182600.0, ans=0.125 2023-09-28 23:54:12,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:12,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:18,752 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 23:54:18,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 23:54:21,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. 
Duration: 0.72725 2023-09-28 23:54:22,666 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.02 vs. limit=15.0 2023-09-28 23:54:23,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:54:25,943 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.04 vs. limit=15.0 2023-09-28 23:54:27,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:54:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:31,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 23:54:33,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:54:35,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=182666.66666666666, ans=0.125 2023-09-28 23:54:37,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 23:54:42,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:54:43,407 INFO [train.py:1039] (0/4) Epoch 6, batch 850, loss[loss=0.2235, simple_loss=0.2911, pruned_loss=0.07799, over 24343.00 frames. ], tot_loss[loss=0.2389, simple_loss=0.2999, pruned_loss=0.08898, over 4634752.38 frames. 
], batch size: 61, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:54:45,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:54:45,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 23:54:45,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:54:48,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:48,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 23:54:48,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:49,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:54:52,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:54:53,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:54:55,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:56,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 23:54:56,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. 
Duration: 0.96 2023-09-28 23:54:58,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 23:54:59,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:55:00,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:02,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:03,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:03,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:55:08,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:08,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:10,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 23:55:11,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 23:55:14,222 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.33 vs. limit=10.0 2023-09-28 23:55:14,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. 
Duration: 0.97275 2023-09-28 23:55:16,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 23:55:21,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 23:55:22,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 23:55:23,393 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.32 vs. limit=15.0 2023-09-28 23:55:24,285 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 23:55:26,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:26,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:55:26,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:55:27,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:28,415 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.77 vs. limit=22.5 2023-09-28 23:55:29,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. 
Duration: 0.59 2023-09-28 23:55:33,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:33,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:35,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:55:36,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:55:37,985 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.378e+02 2.757e+02 3.805e+02, threshold=4.755e+02, percent-clipped=0.0 2023-09-28 23:55:38,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:55:38,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=182933.33333333334, ans=0.125 2023-09-28 23:55:39,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:55:39,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 23:55:42,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:55:42,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:55:45,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:55:45,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:46,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:50,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:55:52,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:55:53,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:53,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:55:56,359 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.38 vs. limit=22.5 2023-09-28 23:56:02,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:56:03,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. 
Duration: 0.936375 2023-09-28 23:56:03,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=183066.66666666666, ans=0.125 2023-09-28 23:56:05,005 INFO [train.py:1039] (0/4) Epoch 6, batch 900, loss[loss=0.2307, simple_loss=0.3078, pruned_loss=0.07686, over 24626.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.3011, pruned_loss=0.08988, over 4651840.74 frames. ], batch size: 68, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:56:05,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 23:56:05,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:05,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:56:06,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 23:56:13,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:56:18,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:20,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 23:56:21,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:56:23,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. 
Duration: 0.95 2023-09-28 23:56:23,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:56:24,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:24,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:25,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:56:25,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:56:25,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=183133.33333333334, ans=0.125 2023-09-28 23:56:29,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=183133.33333333334, ans=0.09899494936611666 2023-09-28 23:56:38,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:56:38,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:38,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:56:42,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-28 23:56:47,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 23:56:47,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:56:52,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:56:52,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:56:52,992 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 23:56:54,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 23:57:00,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:57:00,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:57:01,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:57:06,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=183266.66666666666, ans=0.125 2023-09-28 23:57:09,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:09,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. 
Duration: 0.92225 2023-09-28 23:57:11,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 23:57:11,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:57:14,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 23:57:15,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:57:15,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:17,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:57:19,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:23,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 23:57:23,274 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 23:57:24,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:57:24,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 23:57:29,116 INFO [train.py:1039] (0/4) Epoch 6, batch 950, loss[loss=0.316, simple_loss=0.3435, pruned_loss=0.1443, over 19897.00 frames. 
], tot_loss[loss=0.2405, simple_loss=0.3012, pruned_loss=0.08992, over 4656673.41 frames. ], batch size: 389, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:57:29,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:32,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 23:57:33,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=183400.0, ans=0.0 2023-09-28 23:57:37,354 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.48 vs. limit=15.0 2023-09-28 23:57:38,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:39,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:40,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:41,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:57:43,117 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 23:57:46,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:48,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-28 23:57:50,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:50,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:57:50,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 23:57:50,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:57:52,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=183466.66666666666, ans=0.1 2023-09-28 23:57:53,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:55,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 23:57:55,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:59,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:58:00,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-09-28 23:58:02,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:58:05,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:58:05,632 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=183533.33333333334, ans=0.125 2023-09-28 23:58:06,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:58:10,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=183533.33333333334, ans=0.125 2023-09-28 23:58:11,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:11,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:58:13,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=183533.33333333334, ans=0.2 2023-09-28 23:58:13,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=183533.33333333334, ans=0.95 2023-09-28 23:58:14,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 23:58:18,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-28 23:58:18,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:58:18,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:20,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:20,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:58:24,308 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.57 vs. limit=12.0 2023-09-28 23:58:25,029 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.262e+02 2.553e+02 3.079e+02 4.621e+02, threshold=5.106e+02, percent-clipped=0.0 2023-09-28 23:58:25,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 23:58:26,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:58:29,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:29,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:29,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. 
Duration: 0.8 2023-09-28 23:58:32,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:32,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:58:32,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 23:58:36,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:58:38,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:43,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:58:43,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=183666.66666666666, ans=0.0 2023-09-28 23:58:44,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 23:58:44,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 23:58:49,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:51,512 INFO [train.py:1039] (0/4) Epoch 6, batch 1000, loss[loss=0.237, simple_loss=0.268, pruned_loss=0.103, over 19554.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.3003, pruned_loss=0.08963, over 4662802.67 frames. 
], batch size: 388, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:58:51,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=183733.33333333334, ans=0.0 2023-09-28 23:58:53,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 23:58:53,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=183733.33333333334, ans=0.0 2023-09-28 23:58:54,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:56,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=183733.33333333334, ans=0.125 2023-09-28 23:59:00,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:59:01,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 23:59:01,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 23:59:08,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:08,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:59:08,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-28 23:59:13,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 23:59:16,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 23:59:16,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=183800.0, ans=0.025 2023-09-28 23:59:19,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 23:59:19,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:21,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 23:59:21,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=183800.0, ans=0.0 2023-09-28 23:59:22,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 23:59:22,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 23:59:24,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:27,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:34,302 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.63 vs. 
limit=15.0 2023-09-28 23:59:35,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:35,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:59:37,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:38,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:38,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 23:59:38,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:40,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:59:40,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:41,694 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 23:59:44,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 23:59:45,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 23:59:48,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. 
Duration: 0.82 2023-09-28 23:59:50,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:59:55,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:56,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:59:57,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:58,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:00:01,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 00:00:02,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=184000.0, ans=0.125 2023-09-29 00:00:03,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:00:03,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 00:00:05,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 00:00:05,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=184000.0, ans=0.125 2023-09-29 00:00:06,413 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.09 vs. 
limit=15.0 2023-09-29 00:00:07,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:07,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:00:09,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:00:12,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:00:15,319 INFO [train.py:1039] (0/4) Epoch 6, batch 1050, loss[loss=0.2133, simple_loss=0.2974, pruned_loss=0.06459, over 24631.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.2979, pruned_loss=0.08826, over 4673315.07 frames. ], batch size: 68, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:00:15,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:00:19,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:00:20,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:00:22,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:00:24,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:25,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 00:00:27,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:00:28,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:00:30,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:00:30,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=184133.33333333334, ans=0.1 2023-09-29 00:00:32,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:00:32,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:00:34,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:00:34,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 00:00:36,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:36,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 00:00:39,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:40,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-29 00:00:40,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:00:44,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=184133.33333333334, ans=0.1 2023-09-29 00:00:47,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:48,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:00:48,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:50,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 00:00:52,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 00:00:52,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:54,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 00:00:57,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 00:00:57,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=184200.0, ans=0.125 2023-09-29 00:00:58,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:00:59,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=184200.0, ans=0.2 2023-09-29 00:01:00,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:01:01,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:01:01,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:02,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:01:05,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=184266.66666666666, ans=0.2 2023-09-29 00:01:08,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:01:12,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 00:01:12,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=184266.66666666666, ans=0.1 2023-09-29 00:01:14,314 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.070e+02 2.251e+02 2.597e+02 4.023e+02, threshold=4.502e+02, percent-clipped=0.0 2023-09-29 00:01:14,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 00:01:14,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-29 00:01:15,666 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.81 vs. limit=8.0 2023-09-29 00:01:16,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:16,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:01:17,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 00:01:18,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=184266.66666666666, ans=0.0 2023-09-29 00:01:22,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:01:24,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:24,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:01:24,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:24,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. 
Duration: 0.98 2023-09-29 00:01:32,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:32,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 00:01:32,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 00:01:33,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:01:38,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:01:40,013 INFO [train.py:1039] (0/4) Epoch 6, batch 1100, loss[loss=0.2531, simple_loss=0.3251, pruned_loss=0.09056, over 24435.00 frames. ], tot_loss[loss=0.2375, simple_loss=0.2978, pruned_loss=0.08863, over 4670985.76 frames. ], batch size: 77, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:01:44,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:01:49,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:01:51,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:01:51,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:01:51,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. 
Duration: 0.76 2023-09-29 00:01:53,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:56,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:01:56,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=184466.66666666666, ans=0.0 2023-09-29 00:01:58,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:02:01,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:02:01,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 00:02:03,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:02:04,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:04,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:02:07,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:02:09,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 00:02:13,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=184533.33333333334, ans=0.2 2023-09-29 00:02:14,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:02:17,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 00:02:18,012 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 00:02:20,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:20,889 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=10.79 vs. limit=15.0 2023-09-29 00:02:23,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:02:23,524 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:02:25,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:02:26,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 00:02:28,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 00:02:28,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:02:28,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:02:28,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:28,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 00:02:35,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:02:35,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 00:02:36,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:02:41,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:02:43,481 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=184600.0, ans=0.0 2023-09-29 00:02:44,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 00:02:44,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:02:46,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:02:48,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:49,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:51,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 00:02:53,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:02:53,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:54,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 00:02:55,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:02:57,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 00:02:57,369 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=184666.66666666666, ans=0.125 2023-09-29 00:02:58,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:02:58,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:03:00,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 00:03:04,736 INFO [train.py:1039] (0/4) Epoch 6, batch 1150, loss[loss=0.2228, simple_loss=0.3001, pruned_loss=0.07278, over 24669.00 frames. ], tot_loss[loss=0.2384, simple_loss=0.2988, pruned_loss=0.08896, over 4684637.43 frames. ], batch size: 68, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:03:06,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:10,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:03:11,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:11,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:03:11,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 00:03:13,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:14,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 00:03:16,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:16,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 00:03:16,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=184733.33333333334, ans=0.125 2023-09-29 00:03:21,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 00:03:24,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:28,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=184800.0, ans=0.025 2023-09-29 00:03:29,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:31,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:31,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 00:03:32,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:03:32,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:35,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 00:03:37,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:03:37,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=184866.66666666666, ans=0.125 2023-09-29 00:03:38,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:39,887 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.53 vs. limit=6.0 2023-09-29 00:03:48,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 00:03:56,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:03:56,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:04:00,844 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.083e+02 2.291e+02 2.736e+02 4.000e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-29 00:04:01,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=184933.33333333334, ans=0.04949747468305833 2023-09-29 00:04:01,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=184933.33333333334, ans=0.0 2023-09-29 00:04:03,579 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 00:04:05,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:15,227 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 00:04:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:21,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:04:21,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:04:23,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:04:26,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:27,744 INFO [train.py:1039] (0/4) Epoch 6, batch 1200, loss[loss=0.2326, simple_loss=0.2866, pruned_loss=0.08928, over 23547.00 frames. 
], tot_loss[loss=0.2385, simple_loss=0.2996, pruned_loss=0.08876, over 4696271.76 frames. ], batch size: 134, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:04:30,992 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=185066.66666666666, ans=0.125 2023-09-29 00:04:32,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:04:32,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:04:33,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:33,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:33,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:04:35,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:04:36,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:04:40,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:40,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:43,787 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. 
Duration: 0.896 2023-09-29 00:04:47,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 00:04:48,075 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.42 vs. limit=22.5 2023-09-29 00:04:51,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:04:54,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:04:58,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:59,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:04:59,557 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 00:05:01,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:07,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:05:07,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:05:07,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 00:05:07,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 00:05:11,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 00:05:17,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 00:05:17,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:17,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=185266.66666666666, ans=0.125 2023-09-29 00:05:18,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:05:20,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:20,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:05:22,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:05:22,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:05:24,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:05:24,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 00:05:24,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 00:05:25,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:25,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:05:29,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:29,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:33,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:05:35,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:05:39,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 00:05:42,977 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 00:05:44,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:05:45,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:47,249 INFO [train.py:1039] (0/4) Epoch 6, batch 1250, loss[loss=0.2094, simple_loss=0.2947, pruned_loss=0.06206, over 24652.00 frames. ], tot_loss[loss=0.2395, simple_loss=0.3005, pruned_loss=0.08927, over 4699998.65 frames. 
], batch size: 73, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:05:48,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:05:51,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:54,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 00:05:56,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=185400.0, ans=0.0 2023-09-29 00:05:57,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:05:58,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:05:59,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 00:06:00,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:06:02,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:06:05,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:06:08,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:08,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 00:06:08,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:11,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:06:13,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=185466.66666666666, ans=0.125 2023-09-29 00:06:15,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:06:15,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:06:15,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:06:17,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:17,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:20,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:22,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:06:29,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 00:06:30,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 00:06:35,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:35,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 00:06:35,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:36,627 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 00:06:36,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:36,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:38,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,094 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.172e+02 2.410e+02 2.804e+02 3.996e+02, threshold=4.819e+02, percent-clipped=0.0 2023-09-29 00:06:42,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:06:42,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=185600.0, ans=0.125 2023-09-29 00:06:43,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-09-29 00:06:43,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 00:06:43,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 00:06:48,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:06:49,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 00:06:49,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:52,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:06:52,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:06:56,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 00:06:56,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:06:56,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:06:56,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:06:57,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:06:58,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 00:07:02,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:03,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:07:04,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:07:06,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:07:07,755 INFO [train.py:1039] (0/4) Epoch 6, batch 1300, loss[loss=0.2307, simple_loss=0.2841, pruned_loss=0.08862, over 23558.00 frames. ], tot_loss[loss=0.2407, simple_loss=0.3014, pruned_loss=0.09005, over 4694282.29 frames. ], batch size: 134, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:07:11,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:11,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 00:07:11,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=185733.33333333334, ans=0.125 2023-09-29 00:07:14,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:15,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 00:07:17,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:18,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:07:20,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:07:21,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 00:07:22,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=185800.0, ans=0.125 2023-09-29 00:07:25,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:07:25,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=185800.0, ans=0.2 2023-09-29 00:07:27,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:07:29,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 00:07:33,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:07:37,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:37,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:07:38,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:39,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:39,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=185866.66666666666, ans=0.07 2023-09-29 00:07:40,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:07:40,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:07:42,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 00:07:48,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:07:48,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:07:49,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 00:07:49,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:07:52,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:07:55,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:07:57,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 00:07:58,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:07:58,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 00:07:59,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:02,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:02,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:08:04,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 00:08:05,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 00:08:07,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 00:08:11,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:08:14,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. 
Duration: 0.72 2023-09-29 00:08:14,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=186000.0, ans=0.0 2023-09-29 00:08:17,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:24,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 00:08:27,628 INFO [train.py:1039] (0/4) Epoch 6, batch 1350, loss[loss=0.2656, simple_loss=0.3094, pruned_loss=0.1109, over 23776.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.3011, pruned_loss=0.09077, over 4689340.16 frames. ], batch size: 150, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:08:27,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:29,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:32,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:33,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:36,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:08:36,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:39,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 00:08:40,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=186066.66666666666, ans=0.125 2023-09-29 00:08:41,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 00:08:43,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:08:44,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:08:47,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 00:08:47,956 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=186133.33333333334, ans=0.125 2023-09-29 00:08:49,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:50,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:50,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 00:08:51,334 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.48 vs. limit=15.0 2023-09-29 00:08:53,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 00:08:55,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-09-29 00:08:55,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=186133.33333333334, ans=0.2 2023-09-29 00:08:58,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:58,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 00:09:04,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=186200.0, ans=0.0 2023-09-29 00:09:11,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:18,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=186266.66666666666, ans=0.125 2023-09-29 00:09:21,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:22,609 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.119e+02 2.458e+02 2.800e+02 4.358e+02, threshold=4.916e+02, percent-clipped=0.0 2023-09-29 00:09:22,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:22,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 00:09:25,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:27,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. 
Duration: 0.83 2023-09-29 00:09:27,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:09:28,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:09:29,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=186266.66666666666, ans=0.0 2023-09-29 00:09:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:09:33,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 00:09:33,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=186333.33333333334, ans=0.2 2023-09-29 00:09:34,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:09:40,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 00:09:40,426 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=186333.33333333334, ans=0.125 2023-09-29 00:09:43,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 00:09:48,780 INFO [train.py:1039] (0/4) Epoch 6, batch 1400, loss[loss=0.2117, simple_loss=0.2436, pruned_loss=0.0899, over 19042.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.2993, pruned_loss=0.08978, over 4678907.59 frames. 
], batch size: 388, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:09:48,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 00:09:50,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:50,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=186400.0, ans=0.125 2023-09-29 00:09:53,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:09:55,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:09:57,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=186400.0, ans=0.0 2023-09-29 00:09:58,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 00:10:00,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 00:10:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:10:12,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:16,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:10:17,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 00:10:19,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=186533.33333333334, ans=0.125 2023-09-29 00:10:21,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:10:22,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 00:10:27,889 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:10:30,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:30,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:35,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 00:10:36,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:10:36,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:10:39,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:10:39,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:41,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 00:10:42,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:10:42,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:10:44,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 00:10:44,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=186600.0, ans=0.125 2023-09-29 00:10:45,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:10:49,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:51,134 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-28000.pt 2023-09-29 00:10:56,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:11:03,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 00:11:05,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:11:05,228 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=186666.66666666666, ans=0.125 2023-09-29 00:11:06,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:11:09,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 00:11:10,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:11,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:11:12,502 INFO [train.py:1039] (0/4) Epoch 6, batch 1450, loss[loss=0.2453, simple_loss=0.3047, pruned_loss=0.09298, over 23792.00 frames. ], tot_loss[loss=0.239, simple_loss=0.2993, pruned_loss=0.08935, over 4690501.84 frames. ], batch size: 195, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:11:15,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:11:15,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:11:17,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:17,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:11:22,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:23,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:11:25,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 00:11:25,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 00:11:27,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:11:27,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 00:11:28,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:30,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:30,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 00:11:32,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:11:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:11:34,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 00:11:34,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:34,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=186800.0, ans=0.04949747468305833 2023-09-29 00:11:36,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 00:11:36,964 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.18 vs. limit=22.5 2023-09-29 00:11:37,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:39,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:43,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:11:43,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:11:45,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:46,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:48,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:48,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:11:48,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:49,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:11:52,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=186866.66666666666, ans=0.1 2023-09-29 00:11:53,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 00:11:56,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:12:00,666 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 00:12:02,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:04,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:12:06,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:08,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 00:12:09,775 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.135e+02 2.497e+02 3.099e+02 5.077e+02, threshold=4.994e+02, percent-clipped=1.0 2023-09-29 00:12:11,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=186933.33333333334, ans=0.125 2023-09-29 00:12:12,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:12:13,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=186933.33333333334, ans=0.0 2023-09-29 00:12:14,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 00:12:14,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 00:12:15,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:18,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:20,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:20,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 00:12:23,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 00:12:23,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 00:12:25,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:26,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 00:12:29,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=187000.0, ans=0.05 2023-09-29 00:12:33,028 INFO [train.py:1039] (0/4) Epoch 6, batch 1500, loss[loss=0.2249, simple_loss=0.2967, pruned_loss=0.07657, over 24100.00 frames. ], tot_loss[loss=0.2388, simple_loss=0.2999, pruned_loss=0.08887, over 4705192.87 frames. ], batch size: 80, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:12:38,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 00:12:38,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:12:38,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:12:40,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:41,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:42,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:12:43,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 00:12:45,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:12:45,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. 
Duration: 0.7 2023-09-29 00:12:45,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:46,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:46,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=187066.66666666666, ans=0.125 2023-09-29 00:12:48,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:12:48,737 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.23 vs. limit=6.0 2023-09-29 00:12:49,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:52,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:54,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 00:12:54,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:12:55,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:12:55,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:12:58,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 00:13:02,035 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:13:03,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 00:13:06,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:13:06,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 00:13:11,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:13:13,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:13:14,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:13:16,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 00:13:16,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:13:16,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:13:17,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 00:13:17,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:21,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=187266.66666666666, ans=0.125 2023-09-29 00:13:25,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:13:25,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 00:13:28,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:13:31,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:13:35,663 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 00:13:35,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:37,111 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 00:13:37,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:13:38,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:13:40,801 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 00:13:42,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:13:46,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 00:13:48,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:52,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:53,964 INFO [train.py:1039] (0/4) Epoch 6, batch 1550, loss[loss=0.2453, simple_loss=0.2923, pruned_loss=0.09915, over 23670.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.3001, pruned_loss=0.0885, over 4718596.14 frames. ], batch size: 149, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:13:54,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:54,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:55,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. 
Duration: 0.56 2023-09-29 00:13:55,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 00:13:55,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:13:57,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 00:13:57,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 00:14:00,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:01,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:03,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:03,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:14:03,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:04,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:07,750 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 00:14:07,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:14:07,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:14:09,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:14:10,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:14:11,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 00:14:12,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:13,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 00:14:14,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 00:14:14,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 00:14:14,668 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.18 vs. limit=22.5 2023-09-29 00:14:16,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:17,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:21,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:14:24,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 00:14:24,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 00:14:33,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:35,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=187533.33333333334, ans=0.07 2023-09-29 00:14:36,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:36,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:14:36,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:14:36,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 00:14:37,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.56 vs. limit=15.0 2023-09-29 00:14:39,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=187600.0, ans=0.5 2023-09-29 00:14:41,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:14:42,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:14:43,257 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.71 vs. limit=12.0 2023-09-29 00:14:45,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:14:48,222 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.43 vs. limit=22.5 2023-09-29 00:14:49,287 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.090e+02 2.378e+02 2.713e+02 3.704e+02, threshold=4.756e+02, percent-clipped=0.0 2023-09-29 00:14:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:14:49,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:49,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 00:14:49,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:14:51,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:14:51,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:53,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:14:53,765 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-09-29 00:14:56,134 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.03 vs. limit=15.0 2023-09-29 00:14:56,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:01,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 00:15:07,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:08,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:09,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 00:15:10,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:15:12,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:12,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:15:12,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:15:13,498 INFO [train.py:1039] (0/4) Epoch 6, batch 1600, loss[loss=0.2108, simple_loss=0.2829, pruned_loss=0.06931, over 24482.00 frames. ], tot_loss[loss=0.2398, simple_loss=0.301, pruned_loss=0.08934, over 4700177.92 frames. 
], batch size: 63, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:15:13,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:15:16,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:17,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 00:15:17,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 00:15:21,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 00:15:25,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:15:26,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 00:15:28,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:15:30,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:15:33,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=187800.0, ans=0.1 2023-09-29 00:15:36,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:15:40,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-09-29 00:15:42,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:15:43,317 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.81 vs. limit=22.5 2023-09-29 00:15:43,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 00:15:45,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:45,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 00:15:50,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 00:15:58,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:59,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 00:16:00,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:16:01,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:01,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:16:05,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-09-29 00:16:09,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:16:10,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:16:11,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:16:14,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:16:16,439 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.67 vs. limit=15.0 2023-09-29 00:16:17,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:16:17,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=188000.0, ans=0.125 2023-09-29 00:16:18,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 00:16:22,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=188000.0, ans=0.0 2023-09-29 00:16:23,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:23,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=188000.0, ans=0.125 2023-09-29 00:16:25,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:16:27,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 00:16:28,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:16:28,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 00:16:33,095 INFO [train.py:1039] (0/4) Epoch 6, batch 1650, loss[loss=0.2377, simple_loss=0.3105, pruned_loss=0.08245, over 24420.00 frames. ], tot_loss[loss=0.2395, simple_loss=0.3015, pruned_loss=0.08871, over 4725100.25 frames. ], batch size: 77, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:16:36,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:36,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=188066.66666666666, ans=0.125 2023-09-29 00:16:37,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 00:16:37,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:16:39,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 00:16:39,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 00:16:39,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 00:16:39,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 00:16:42,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:43,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:44,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:16:44,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:16:47,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:48,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 00:16:51,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:16:51,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:51,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:16:51,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:16:53,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 00:16:53,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 00:16:58,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:16:59,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:17:08,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 00:17:10,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:12,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 00:17:16,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:17:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:17:20,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:21,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:17:21,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:24,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=188266.66666666666, ans=0.125 2023-09-29 00:17:25,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:25,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:25,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=188266.66666666666, ans=0.0 2023-09-29 00:17:27,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:27,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 00:17:28,446 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.178e+02 2.493e+02 2.802e+02 6.343e+02, threshold=4.987e+02, percent-clipped=2.0 2023-09-29 00:17:28,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:28,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:17:31,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:33,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 00:17:34,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:34,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 00:17:37,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 00:17:37,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 00:17:39,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:39,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:17:40,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:17:40,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:40,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 00:17:45,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:46,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:17:47,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:50,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 00:17:53,485 INFO [train.py:1039] (0/4) Epoch 6, batch 1700, loss[loss=0.2397, simple_loss=0.2833, pruned_loss=0.0981, over 23612.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.3006, pruned_loss=0.08914, over 4710223.57 frames. ], batch size: 256, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:17:55,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:55,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:17:55,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 00:17:55,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 00:17:56,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:17:56,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:59,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:17:59,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:17:59,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 00:18:01,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=188400.0, ans=0.0 2023-09-29 00:18:02,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:18:09,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:18:13,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:18:15,356 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.46 vs. 
limit=6.0 2023-09-29 00:18:16,538 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=188466.66666666666, ans=0.125 2023-09-29 00:18:19,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:18:21,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:21,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:18:21,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:24,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 00:18:24,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:18:25,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:27,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:18:27,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:18:29,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 00:18:30,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-09-29 00:18:32,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:33,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 00:18:35,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:18:44,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:18:46,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:46,957 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.75 vs. limit=12.0 2023-09-29 00:18:47,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:49,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:18:49,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 00:18:49,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:51,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:51,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. 
Duration: 0.91 2023-09-29 00:18:53,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:18:53,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:53,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:53,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:18:56,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:56,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:18:57,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:57,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:18:57,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:19:02,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:05,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 00:19:05,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 00:19:07,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:08,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 00:19:13,270 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.13 vs. limit=22.5 2023-09-29 00:19:16,069 INFO [train.py:1039] (0/4) Epoch 6, batch 1750, loss[loss=0.2239, simple_loss=0.2816, pruned_loss=0.08315, over 23300.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.2981, pruned_loss=0.08835, over 4695695.84 frames. ], batch size: 105, lr: 1.72e-02, grad_scale: 32.0 2023-09-29 00:19:16,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=188733.33333333334, ans=0.07 2023-09-29 00:19:17,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:21,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:21,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:19:22,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 00:19:22,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:19:26,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 00:19:26,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:29,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 00:19:32,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 00:19:35,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:19:37,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:19:38,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:19:40,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 00:19:43,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:19:43,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 00:19:53,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 00:19:55,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=188866.66666666666, ans=0.1 2023-09-29 00:19:57,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:19:57,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:00,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:00,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:03,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:05,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:05,979 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=188933.33333333334, ans=0.125 2023-09-29 00:20:07,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:07,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:20:08,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-09-29 00:20:10,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:12,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 00:20:13,403 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.157e+02 2.511e+02 2.908e+02 4.872e+02, threshold=5.023e+02, percent-clipped=0.0 2023-09-29 00:20:13,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:16,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:18,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:20:20,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=189000.0, ans=0.125 2023-09-29 00:20:21,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:20:23,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 00:20:23,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:25,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:31,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:20:34,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:20:35,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:20:37,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 00:20:37,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:38,619 INFO [train.py:1039] (0/4) Epoch 6, batch 1800, loss[loss=0.2308, simple_loss=0.2883, pruned_loss=0.08664, over 17322.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.2976, pruned_loss=0.08842, over 4685823.41 frames. ], batch size: 37, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:20:38,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:20:38,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:38,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:20:38,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:20:39,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=189066.66666666666, ans=0.07 2023-09-29 00:20:40,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 00:20:42,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:20:43,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:44,525 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.85 vs. limit=15.0 2023-09-29 00:20:45,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:20:48,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:49,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=189066.66666666666, ans=0.1 2023-09-29 00:20:51,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:20:52,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:55,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:20:57,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:59,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 00:20:59,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:21:03,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:21:03,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 00:21:04,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:08,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:12,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 00:21:15,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 00:21:15,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 00:21:15,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:17,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:21:17,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:21:19,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 00:21:19,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=189200.0, ans=0.1 2023-09-29 00:21:22,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=189200.0, ans=0.125 2023-09-29 00:21:23,914 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 00:21:25,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:21:28,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:29,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 00:21:31,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 00:21:32,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:21:33,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:21:35,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:21:39,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 00:21:48,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 00:21:48,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 00:21:48,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:21:48,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:50,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:21:50,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 00:21:53,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:21:53,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:21:56,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 00:21:56,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:59,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:21:59,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:21:59,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:21:59,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:22:00,772 INFO [train.py:1039] (0/4) Epoch 6, batch 1850, loss[loss=0.2119, simple_loss=0.2887, pruned_loss=0.06756, over 24469.00 frames. ], tot_loss[loss=0.2373, simple_loss=0.2978, pruned_loss=0.08841, over 4695716.70 frames. ], batch size: 66, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:22:00,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:22:02,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:22:02,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:04,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=189400.0, ans=0.125 2023-09-29 00:22:06,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:22:06,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:10,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=189400.0, ans=0.125 2023-09-29 00:22:14,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:22:16,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. 
Duration: 0.89 2023-09-29 00:22:18,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 00:22:22,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 00:22:25,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:22:27,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 00:22:27,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 00:22:36,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:22:40,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 00:22:41,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:22:43,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:22:48,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 00:22:49,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:49,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 00:22:50,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:22:53,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:56,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:59,537 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.153e+02 2.382e+02 2.790e+02 3.964e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 00:22:59,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:22:59,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:59,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:23:01,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:02,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:02,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:23:06,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-09-29 00:23:07,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:10,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:23:10,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:23:10,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 00:23:10,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 00:23:14,253 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 00:23:14,394 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 00:23:17,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:23:17,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:23:17,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:17,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,346 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. 
Duration: 0.896 2023-09-29 00:23:19,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:23:19,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:23:21,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:23:22,515 INFO [train.py:1039] (0/4) Epoch 6, batch 1900, loss[loss=0.2474, simple_loss=0.3212, pruned_loss=0.08683, over 24311.00 frames. ], tot_loss[loss=0.2386, simple_loss=0.2993, pruned_loss=0.08894, over 4698127.79 frames. ], batch size: 74, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:23:22,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:23:22,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 00:23:26,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:26,249 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 00:23:26,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:23:28,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:23:32,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:35,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:23:37,367 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 00:23:37,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 00:23:39,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:40,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:23:40,604 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 00:23:40,645 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 00:23:45,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 00:23:47,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:23:47,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=189800.0, ans=0.1 2023-09-29 00:23:50,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. 
Duration: 0.65 2023-09-29 00:23:53,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 00:24:03,525 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.53 vs. limit=15.0 2023-09-29 00:24:04,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 00:24:07,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 00:24:07,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:07,240 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 00:24:07,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 00:24:07,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 00:24:07,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=189866.66666666666, ans=0.1 2023-09-29 00:24:08,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 00:24:08,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:24:13,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. 
Duration: 0.97 2023-09-29 00:24:17,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:24:20,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 00:24:24,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:24:27,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 00:24:27,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:33,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:24:33,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:24:33,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:24:34,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:24:36,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:24:37,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 00:24:37,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:24:37,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=190000.0, ans=0.125 2023-09-29 00:24:39,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:39,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:24:39,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=190000.0, ans=0.0 2023-09-29 00:24:42,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:24:42,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:42,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:45,106 INFO [train.py:1039] (0/4) Epoch 6, batch 1950, loss[loss=0.2572, simple_loss=0.3179, pruned_loss=0.09826, over 23465.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3006, pruned_loss=0.08958, over 4687954.80 frames. ], batch size: 93, lr: 1.72e-02, grad_scale: 8.0 2023-09-29 00:24:45,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:49,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 00:24:51,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:24:51,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:51,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:24:52,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 00:24:54,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:24:54,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:56,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:58,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:24:59,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:24:59,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:01,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 00:25:01,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=190133.33333333334, ans=0.1 2023-09-29 00:25:06,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:25:08,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:25:08,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:25:08,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:11,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:14,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:25:14,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:14,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:25:14,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 00:25:14,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:25:14,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:25:16,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:20,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:22,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:25:26,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:25:30,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:25:30,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:25:32,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 00:25:32,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:25:36,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:40,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:25:40,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 00:25:46,170 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.214e+02 2.605e+02 2.904e+02 4.592e+02, threshold=5.209e+02, percent-clipped=0.0 2023-09-29 00:25:49,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:51,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:53,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:56,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:59,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:26:00,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:26:01,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 00:26:01,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:26:01,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:03,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 00:26:06,599 INFO [train.py:1039] (0/4) Epoch 6, batch 2000, loss[loss=0.2371, simple_loss=0.3132, pruned_loss=0.08047, over 24693.00 frames. 
], tot_loss[loss=0.2406, simple_loss=0.3016, pruned_loss=0.08981, over 4702881.05 frames. ], batch size: 73, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:26:06,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:09,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:26:09,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:26:11,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:13,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:26:15,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:18,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 00:26:18,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:26:19,720 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=9.35 vs. limit=15.0 2023-09-29 00:26:23,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:26:24,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=190466.66666666666, ans=0.125 2023-09-29 00:26:24,533 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=15.0 2023-09-29 00:26:25,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 00:26:26,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:26:26,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:29,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:26:30,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 00:26:30,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=190466.66666666666, ans=0.125 2023-09-29 00:26:32,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:26:33,223 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=190466.66666666666, ans=0.1 2023-09-29 00:26:33,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=190466.66666666666, ans=0.0 2023-09-29 00:26:35,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 00:26:36,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:26:38,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 00:26:38,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:41,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:26:41,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:26:41,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:42,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:26:44,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:26:45,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 00:26:46,815 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.52 vs. limit=10.0 2023-09-29 00:26:49,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 00:26:49,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:50,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:55,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:56,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:26:57,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:26:58,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:59,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:00,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:27:00,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:27:01,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:02,686 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:27:03,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:06,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:27:08,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 00:27:15,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:27:15,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:17,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=190666.66666666666, ans=0.1 2023-09-29 00:27:18,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:18,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:27:21,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:27:23,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:23,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:25,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:27:25,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:27:29,347 INFO [train.py:1039] (0/4) Epoch 6, batch 2050, loss[loss=0.2364, simple_loss=0.3026, pruned_loss=0.08505, over 23403.00 frames. ], tot_loss[loss=0.2408, simple_loss=0.3018, pruned_loss=0.08989, over 4697129.43 frames. ], batch size: 93, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:27:29,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:30,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:32,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:32,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:38,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:40,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 00:27:40,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:41,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:27:45,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 00:27:45,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:27:47,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:47,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:27:56,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:27:56,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:28:00,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 00:28:03,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:28:04,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=190866.66666666666, ans=0.125 2023-09-29 00:28:04,572 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=17.81 vs. limit=15.0 2023-09-29 00:28:05,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 00:28:05,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:28:07,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:10,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:11,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:28:12,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:14,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:28:16,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:28:16,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 00:28:19,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:21,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:28:21,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=190933.33333333334, ans=0.1 2023-09-29 00:28:24,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:28:25,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:28:29,264 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.164e+02 2.484e+02 2.839e+02 4.579e+02, threshold=4.968e+02, percent-clipped=0.0 2023-09-29 00:28:29,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:35,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:28:35,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 00:28:41,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:42,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 00:28:44,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:28:45,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=191000.0, ans=0.0 2023-09-29 00:28:47,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 00:28:50,062 INFO [train.py:1039] (0/4) Epoch 6, batch 2100, loss[loss=0.2613, simple_loss=0.3069, pruned_loss=0.1079, over 23854.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.2994, pruned_loss=0.08991, over 4673727.02 frames. ], batch size: 195, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:28:50,300 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 00:28:50,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:28:50,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:51,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:28:53,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:53,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 00:28:54,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. 
Duration: 0.96 2023-09-29 00:28:54,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=191066.66666666666, ans=0.125 2023-09-29 00:28:56,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:59,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:28:59,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:29:03,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:05,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:29:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 00:29:05,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:29:07,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 00:29:07,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 00:29:08,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:09,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 00:29:09,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 00:29:10,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 00:29:15,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 00:29:15,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:29:19,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:29:21,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:29:22,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:29:24,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 00:29:24,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:24,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:29:26,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=191200.0, ans=0.2 2023-09-29 00:29:27,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. 
Duration: 0.87 2023-09-29 00:29:29,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:29,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 00:29:29,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 00:29:29,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 00:29:32,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:29:34,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:29:34,818 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.30 vs. limit=12.0 2023-09-29 00:29:36,341 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=15.0 2023-09-29 00:29:37,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:37,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:40,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:42,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:29:42,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 00:29:42,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:42,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:43,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:43,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 00:29:45,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 00:29:45,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 00:29:48,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:29:52,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:52,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 00:29:54,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=191333.33333333334, ans=0.125 2023-09-29 00:29:59,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:30:02,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:30:02,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:02,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:02,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:30:04,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:06,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:06,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:30:07,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:30:07,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:09,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 00:30:10,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 00:30:10,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:30:12,266 INFO [train.py:1039] (0/4) Epoch 6, batch 2150, loss[loss=0.249, simple_loss=0.2941, pruned_loss=0.1019, over 23810.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.2986, pruned_loss=0.08895, over 4691556.86 frames. ], batch size: 179, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:30:14,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:30:14,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:30:14,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:30:14,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:30:21,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:30:24,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:24,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:25,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:30:25,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:25,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 00:30:29,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:30,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:30:30,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:30:34,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:34,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 00:30:38,216 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.30 vs. limit=15.0 2023-09-29 00:30:39,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:40,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:30:41,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:42,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:30:42,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:30:44,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:44,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:45,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:46,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=191533.33333333334, ans=0.0 2023-09-29 00:30:47,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 00:30:47,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:30:49,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:49,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:51,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:52,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:30:54,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 00:30:54,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=191533.33333333334, ans=0.1 2023-09-29 00:30:55,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:30:57,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:57,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 00:30:57,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:30:59,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=191533.33333333334, ans=0.0 2023-09-29 00:31:00,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:00,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:02,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:02,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:31:03,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:31:05,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:05,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 00:31:05,485 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=191600.0, ans=0.125 2023-09-29 00:31:07,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 00:31:07,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:31:08,654 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 00:31:08,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:10,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:31:12,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 00:31:12,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:31:12,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 00:31:12,156 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. 
Duration: 0.768 2023-09-29 00:31:12,156 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 00:31:13,511 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.175e+02 2.371e+02 2.778e+02 4.132e+02, threshold=4.742e+02, percent-clipped=0.0 2023-09-29 00:31:13,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 00:31:15,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:15,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:31:16,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:31:16,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:18,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:31:19,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:19,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:25,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=191666.66666666666, ans=0.1 2023-09-29 00:31:30,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 00:31:31,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 00:31:34,636 INFO [train.py:1039] (0/4) Epoch 6, batch 2200, loss[loss=0.2276, simple_loss=0.3059, pruned_loss=0.07467, over 24382.00 frames. ], tot_loss[loss=0.2379, simple_loss=0.2985, pruned_loss=0.08868, over 4685292.18 frames. ], batch size: 77, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:31:34,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:31:39,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:40,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:31:40,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:31:44,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:31:46,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:46,844 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.28 vs. limit=6.0 2023-09-29 00:31:47,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:31:47,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. 
Duration: 0.88 2023-09-29 00:31:54,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 00:31:55,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:32:01,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 00:32:05,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:07,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:07,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:32:11,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:32:11,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 00:32:16,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:32:16,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:16,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=191866.66666666666, ans=0.2 2023-09-29 00:32:18,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. 
Duration: 0.4 2023-09-29 00:32:21,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:32:21,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=191933.33333333334, ans=0.0 2023-09-29 00:32:23,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:25,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:32:26,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:28,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 00:32:30,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:30,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 00:32:32,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:32,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:32:32,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 00:32:32,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=191933.33333333334, ans=0.025 2023-09-29 00:32:36,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:37,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:37,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:37,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:39,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:32:40,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:32:42,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:32:45,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:32:45,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:32:47,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:32:48,728 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. 
Duration: 0.896 2023-09-29 00:32:50,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:32:50,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=192000.0, ans=0.1 2023-09-29 00:32:51,719 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 00:32:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:32:51,913 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 00:32:53,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:55,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:32:56,883 INFO [train.py:1039] (0/4) Epoch 6, batch 2250, loss[loss=0.2552, simple_loss=0.3113, pruned_loss=0.09958, over 23662.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.2989, pruned_loss=0.08905, over 4678983.92 frames. ], batch size: 256, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:32:58,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:59,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 00:33:00,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:33:04,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 00:33:11,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:33:11,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:33:14,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:15,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:17,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:17,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=192133.33333333334, ans=0.1 2023-09-29 00:33:20,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 00:33:20,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:20,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:33:23,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 00:33:23,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:33:24,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 00:33:26,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:31,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:33,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:33:33,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:33:35,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 00:33:36,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:40,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:33:44,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:45,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:47,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:33:47,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:48,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:33:50,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:33:54,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:33:56,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:33:57,689 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.089e+02 2.370e+02 2.766e+02 4.098e+02, threshold=4.740e+02, percent-clipped=0.0 2023-09-29 00:34:02,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:34:04,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:34:04,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:34:09,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:34:13,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:34:13,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 00:34:14,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:14,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 00:34:18,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 00:34:18,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=192400.0, ans=0.125 2023-09-29 00:34:19,576 INFO [train.py:1039] (0/4) Epoch 6, batch 2300, loss[loss=0.1965, simple_loss=0.272, pruned_loss=0.06051, over 24299.00 frames. ], tot_loss[loss=0.2392, simple_loss=0.3, pruned_loss=0.08916, over 4689726.07 frames. ], batch size: 61, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:34:19,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:34:19,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:21,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=192400.0, ans=0.125 2023-09-29 00:34:25,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:25,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:34:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-09-29 00:34:27,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=192400.0, ans=0.025 2023-09-29 00:34:28,196 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=192400.0, ans=6.0 2023-09-29 00:34:30,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:38,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:34:38,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:34:38,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:34:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:40,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 00:34:41,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:34:41,800 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:34:42,123 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.88 vs. 
limit=15.0 2023-09-29 00:34:47,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:34:47,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:34:50,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:34:54,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:34:57,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:34:58,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=192533.33333333334, ans=0.0 2023-09-29 00:35:01,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:35:03,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:35:06,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:35:06,975 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.89 vs. limit=22.5 2023-09-29 00:35:07,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 00:35:11,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:35:11,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:35:12,386 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.91 vs. limit=15.0 2023-09-29 00:35:13,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:35:13,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 00:35:16,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:35:16,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:16,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:16,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:35:18,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:20,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 00:35:20,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:35:20,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 00:35:21,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:35:21,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:22,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 00:35:30,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:35:31,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:35:37,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:37,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:35:39,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:35:40,444 INFO [train.py:1039] (0/4) Epoch 6, batch 2350, loss[loss=0.2205, simple_loss=0.3004, pruned_loss=0.07033, over 24659.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3008, pruned_loss=0.08946, over 4692071.71 frames. 
], batch size: 73, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:35:40,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:35:40,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:35:42,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:35:44,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 00:35:50,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:35:50,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 00:35:57,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 00:35:59,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:36:01,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:03,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 00:36:03,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 00:36:07,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:36:12,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 00:36:13,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:16,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:36:16,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:36:19,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:36:22,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 00:36:22,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:36:23,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:23,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:25,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:36:30,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:36:32,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 00:36:33,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:36:36,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:37,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:36:39,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 00:36:39,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:36:41,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 00:36:41,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:36:41,473 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.23 vs. 
limit=15.0 2023-09-29 00:36:42,239 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 2.125e+02 2.359e+02 2.805e+02 3.859e+02, threshold=4.718e+02, percent-clipped=0.0 2023-09-29 00:36:42,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=192933.33333333334, ans=0.0 2023-09-29 00:36:46,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 00:36:49,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 00:36:50,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:50,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:36:50,869 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 00:36:52,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 00:36:52,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 00:36:56,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:37:01,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:37:03,322 INFO [train.py:1039] (0/4) Epoch 6, batch 2400, loss[loss=0.2169, simple_loss=0.2832, pruned_loss=0.07525, over 24493.00 frames. 
], tot_loss[loss=0.2396, simple_loss=0.3004, pruned_loss=0.0894, over 4704157.65 frames. ], batch size: 63, lr: 1.71e-02, grad_scale: 32.0 2023-09-29 00:37:06,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:37:08,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:37:10,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 00:37:10,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 00:37:12,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=193066.66666666666, ans=0.0 2023-09-29 00:37:17,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:37:17,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:37:20,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 00:37:22,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:37:23,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:23,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. 
Duration: 0.53 2023-09-29 00:37:30,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:32,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 00:37:37,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:37:40,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 00:37:43,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:37:45,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:48,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=193200.0, ans=0.125 2023-09-29 00:37:49,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:37:50,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 00:37:51,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:37:59,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:02,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:38:05,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:05,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:38:05,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:38:05,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:38:05,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:07,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:07,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:38:12,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:12,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:38:12,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 00:38:14,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 00:38:18,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:38:18,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:18,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 00:38:19,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 00:38:19,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 00:38:19,795 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 00:38:21,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 00:38:21,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:38:23,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:24,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:25,943 INFO [train.py:1039] (0/4) Epoch 6, batch 2450, loss[loss=0.2127, simple_loss=0.2823, pruned_loss=0.07153, over 24483.00 frames. ], tot_loss[loss=0.2387, simple_loss=0.2992, pruned_loss=0.0891, over 4695995.25 frames. ], batch size: 63, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:38:26,002 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-29 00:38:26,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:28,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:38:32,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:38:32,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:35,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:37,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:37,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 00:38:43,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:38:48,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:38:48,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 00:38:49,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 00:38:53,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:55,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:38:56,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:56,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=193533.33333333334, ans=0.125 2023-09-29 00:38:59,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:38:59,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:01,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:03,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:39:04,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 00:39:04,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:39:15,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:39:15,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:39:16,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:16,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:39:18,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:18,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:39:18,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 00:39:22,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=193600.0, ans=0.1 2023-09-29 00:39:23,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:23,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:39:26,707 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.234e+02 2.563e+02 3.066e+02 5.570e+02, threshold=5.125e+02, percent-clipped=5.0 2023-09-29 00:39:26,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:39:26,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:31,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:39:32,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 00:39:34,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:39:34,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:39:34,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 00:39:34,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:39:36,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:39:39,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:39:42,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:42,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:39:45,687 INFO [train.py:1039] (0/4) Epoch 6, batch 2500, loss[loss=0.2078, simple_loss=0.2657, pruned_loss=0.07494, over 23793.00 frames. 
], tot_loss[loss=0.2371, simple_loss=0.2978, pruned_loss=0.08818, over 4711119.85 frames. ], batch size: 150, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:39:46,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 00:39:47,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:39:52,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=193733.33333333334, ans=0.0 2023-09-29 00:39:54,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:39:54,712 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.39 vs. limit=22.5 2023-09-29 00:40:02,499 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.56 vs. limit=12.0 2023-09-29 00:40:04,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:40:05,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:40:07,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:40:07,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-09-29 00:40:11,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=193800.0, ans=0.125 2023-09-29 00:40:12,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:40:13,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=193800.0, ans=0.1 2023-09-29 00:40:14,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:15,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:40:15,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 00:40:15,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 00:40:17,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:18,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:18,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 00:40:20,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:20,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. 
Duration: 0.95 2023-09-29 00:40:21,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:25,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:40:26,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:30,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:40:30,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 00:40:32,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:40:33,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:37,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:42,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:45,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:40:49,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:40:52,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. 
Duration: 0.91 2023-09-29 00:40:52,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:52,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:40:55,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:40:55,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:40:56,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 00:40:56,675 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 00:40:56,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 00:40:59,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:03,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 00:41:03,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 00:41:04,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:41:04,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. 
Duration: 0.76 2023-09-29 00:41:07,217 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.57 vs. limit=10.0 2023-09-29 00:41:07,762 INFO [train.py:1039] (0/4) Epoch 6, batch 2550, loss[loss=0.2469, simple_loss=0.3233, pruned_loss=0.08524, over 24679.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2979, pruned_loss=0.08746, over 4725565.61 frames. ], batch size: 73, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:41:09,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 00:41:11,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:13,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:41:14,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:41:16,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:16,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 00:41:16,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=194066.66666666666, ans=0.125 2023-09-29 00:41:16,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=194066.66666666666, ans=0.0 2023-09-29 00:41:18,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 00:41:21,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 00:41:23,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:41:25,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:26,858 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.65 vs. limit=15.0 2023-09-29 00:41:27,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:41:27,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 00:41:27,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:41:27,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:29,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:30,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:41:30,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 00:41:32,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 00:41:32,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:32,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 00:41:35,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=194133.33333333334, ans=0.1 2023-09-29 00:41:45,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:41:51,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:41:51,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:51,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:51,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=194200.0, ans=0.0 2023-09-29 00:41:53,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:42:00,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:42:02,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:42:02,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 00:42:02,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:42:04,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:42:04,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:42:09,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:09,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:10,628 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.103e+02 2.352e+02 2.955e+02 4.902e+02, threshold=4.704e+02, percent-clipped=0.0 2023-09-29 00:42:14,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:42:14,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 00:42:14,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:42:15,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:17,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 00:42:18,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:42:18,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:26,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:42:27,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:30,638 INFO [train.py:1039] (0/4) Epoch 6, batch 2600, loss[loss=0.2342, simple_loss=0.2978, pruned_loss=0.08525, over 17012.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2987, pruned_loss=0.08721, over 4724311.41 frames. ], batch size: 36, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:42:32,268 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 00:42:35,196 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 00:42:35,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:42:35,286 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 00:42:35,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=194400.0, ans=0.125 2023-09-29 00:42:37,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 00:42:37,435 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. 
Duration: 0.768 2023-09-29 00:42:37,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=194400.0, ans=0.125 2023-09-29 00:42:39,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:39,229 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 00:42:41,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 00:42:42,824 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 00:42:43,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=194400.0, ans=0.125 2023-09-29 00:42:45,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:42:46,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=194466.66666666666, ans=0.125 2023-09-29 00:42:47,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 00:42:47,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 00:42:48,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:42:48,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. 
Duration: 0.73 2023-09-29 00:42:52,082 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 00:42:53,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 00:43:00,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=194466.66666666666, ans=0.0 2023-09-29 00:43:03,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:03,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:05,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:05,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 00:43:06,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:43:10,241 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=194533.33333333334, ans=0.0 2023-09-29 00:43:10,684 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.03 vs. limit=22.5 2023-09-29 00:43:12,029 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. 
Duration: 0.768 2023-09-29 00:43:12,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=194533.33333333334, ans=15.0 2023-09-29 00:43:12,699 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.50 vs. limit=15.0 2023-09-29 00:43:18,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:20,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:20,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 00:43:20,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:20,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:21,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 00:43:25,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:43:25,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:43:27,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:31,056 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. 
Duration: 0.896 2023-09-29 00:43:32,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:32,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:43:36,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=194666.66666666666, ans=0.05 2023-09-29 00:43:40,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:41,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:43:41,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 00:43:41,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:44,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:43:44,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:43:52,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 00:43:52,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:43:53,558 INFO [train.py:1039] (0/4) Epoch 6, batch 2650, loss[loss=0.2709, simple_loss=0.3157, pruned_loss=0.1131, over 23774.00 frames. ], tot_loss[loss=0.2369, simple_loss=0.299, pruned_loss=0.08737, over 4735419.36 frames. ], batch size: 232, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:43:53,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:43:58,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 00:43:58,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:59,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:44:01,188 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 00:44:01,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:04,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:08,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:44:10,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:44:12,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 00:44:14,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 00:44:14,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:44:15,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:44:17,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 00:44:18,604 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 00:44:21,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:21,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 00:44:23,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:23,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 00:44:27,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:44:29,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:44:29,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:34,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 00:44:34,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 00:44:36,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:44:39,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 00:44:39,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:41,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:41,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:44:43,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:43,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:45,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=194933.33333333334, ans=0.07 2023-09-29 00:44:46,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:44:48,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=194933.33333333334, ans=0.0 2023-09-29 00:44:49,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:49,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:49,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:44:51,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:44:52,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:54,156 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.176e+02 2.610e+02 3.276e+02 6.463e+02, threshold=5.220e+02, percent-clipped=8.0 2023-09-29 00:44:54,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:44:54,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:56,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:56,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 00:45:03,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:03,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:45:03,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:03,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 00:45:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:08,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:08,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=195000.0, ans=0.2 2023-09-29 00:45:09,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:11,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:11,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=195000.0, ans=0.1 2023-09-29 00:45:12,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:45:12,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:45:13,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=195066.66666666666, ans=0.0 2023-09-29 00:45:14,215 INFO [train.py:1039] (0/4) Epoch 6, batch 2700, loss[loss=0.2086, simple_loss=0.2819, pruned_loss=0.06759, over 24328.00 frames. ], tot_loss[loss=0.2385, simple_loss=0.3004, pruned_loss=0.08833, over 4726058.46 frames. ], batch size: 61, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:45:15,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:15,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 00:45:19,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:45:21,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 00:45:22,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:45:22,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:22,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:24,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:45:24,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 00:45:24,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:45:24,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:45:24,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 00:45:24,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:45:26,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:45:28,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:45:29,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:45:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 00:45:36,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:45:36,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=195133.33333333334, ans=0.0 2023-09-29 00:45:42,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 00:45:42,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:45:48,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:45:48,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:48,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:45:48,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:45:51,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=195200.0, ans=0.1 2023-09-29 00:45:53,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:45:57,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:57,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:45:57,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:45:59,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=195200.0, ans=0.2 2023-09-29 00:46:01,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:46:01,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:46:02,692 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.61 vs. limit=15.0 2023-09-29 00:46:10,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:46:12,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:15,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:46:15,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:15,509 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=195266.66666666666, ans=0.1 2023-09-29 00:46:18,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:19,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:19,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:46:21,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:46:22,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:23,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:26,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:46:26,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:26,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:29,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 00:46:29,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:33,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:46:33,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 00:46:33,879 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.14 vs. limit=22.5 2023-09-29 00:46:36,438 INFO [train.py:1039] (0/4) Epoch 6, batch 2750, loss[loss=0.25, simple_loss=0.2874, pruned_loss=0.1063, over 22633.00 frames. ], tot_loss[loss=0.2383, simple_loss=0.3003, pruned_loss=0.08818, over 4739824.33 frames. 
], batch size: 322, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:46:36,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 00:46:36,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:40,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:46:40,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:41,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:41,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:46:42,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:43,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=195400.0, ans=0.015 2023-09-29 00:46:44,180 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.85 vs. limit=15.0 2023-09-29 00:46:45,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:46:45,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-29 00:46:46,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:46:46,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:46,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 00:46:46,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:46:46,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:54,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 00:46:55,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:55,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:55,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:57,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:46:59,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:47:00,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 00:47:01,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:01,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:03,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=195466.66666666666, ans=0.0 2023-09-29 00:47:05,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:47:05,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:47:07,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:47:07,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:10,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:47:18,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:20,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:47:20,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:47:25,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=195600.0, ans=0.1 2023-09-29 00:47:26,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:26,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:47:26,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:47:33,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:47:33,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:47:33,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 00:47:38,283 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 2.212e+02 2.511e+02 3.083e+02 4.520e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-29 00:47:39,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:41,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 00:47:48,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:47:50,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 00:47:50,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 00:47:52,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:47:53,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:47:53,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 00:47:53,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:47:56,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:47:58,243 INFO [train.py:1039] (0/4) Epoch 6, batch 2800, loss[loss=0.2543, simple_loss=0.2961, pruned_loss=0.1063, over 23837.00 frames. ], tot_loss[loss=0.2372, simple_loss=0.2985, pruned_loss=0.08793, over 4714087.77 frames. ], batch size: 195, lr: 1.70e-02, grad_scale: 32.0 2023-09-29 00:47:58,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:47:58,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:47:59,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 00:47:59,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:48:00,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:03,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:03,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 00:48:03,150 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 00:48:08,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:09,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:48:09,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:48:14,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:48:15,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 00:48:18,054 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.58 vs. limit=10.0 2023-09-29 00:48:19,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:48:20,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-09-29 00:48:21,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:22,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:48:22,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:23,000 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=195800.0, ans=10.0 2023-09-29 00:48:24,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:26,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:26,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:48:27,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:48:35,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:48:37,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:38,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:48:40,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:48:42,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:47,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:48:47,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 00:48:49,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:49,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:49,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:48:56,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:57,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:59,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:49:01,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:49:03,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:49:03,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:49:03,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:49:04,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:49:06,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:49:06,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 00:49:06,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:07,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:49:07,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:09,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 00:49:10,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:10,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:49:12,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 00:49:13,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 00:49:19,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:49:19,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:49:19,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:49:21,323 INFO [train.py:1039] (0/4) Epoch 6, batch 2850, loss[loss=0.2307, simple_loss=0.3101, pruned_loss=0.07569, over 24584.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.2976, pruned_loss=0.08728, over 4699322.32 frames. ], batch size: 71, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:49:23,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:23,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=196066.66666666666, ans=0.2 2023-09-29 00:49:26,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:49:26,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:49:26,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:49:29,040 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.95 vs. 
limit=6.0 2023-09-29 00:49:29,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:29,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:32,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:49:33,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 00:49:33,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=196066.66666666666, ans=0.0 2023-09-29 00:49:39,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 00:49:39,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:41,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 00:49:41,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:44,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 00:49:44,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 00:49:44,767 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.34 vs. 
limit=6.0 2023-09-29 00:49:47,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:59,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:59,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:01,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:50:01,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:50:01,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:50:01,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:50:02,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:50:02,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 00:50:07,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:50:07,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:07,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:50:09,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:12,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:15,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:15,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:50:17,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:18,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:21,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:50:24,239 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.329e+02 2.690e+02 4.548e+02, threshold=4.658e+02, percent-clipped=0.0 2023-09-29 00:50:27,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 00:50:31,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 00:50:31,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 00:50:32,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:50:32,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:32,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 00:50:33,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=196333.33333333334, ans=0.0 2023-09-29 00:50:34,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:50:34,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:34,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:34,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:50:34,515 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 00:50:34,586 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. 
Duration: 0.896 2023-09-29 00:50:34,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:35,368 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.83 vs. limit=12.0 2023-09-29 00:50:36,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:38,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=196333.33333333334, ans=0.0 2023-09-29 00:50:42,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:50:42,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:42,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:44,194 INFO [train.py:1039] (0/4) Epoch 6, batch 2900, loss[loss=0.2436, simple_loss=0.3121, pruned_loss=0.08753, over 24033.00 frames. ], tot_loss[loss=0.2361, simple_loss=0.298, pruned_loss=0.0871, over 4704080.57 frames. ], batch size: 86, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:50:44,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 00:50:46,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=196400.0, ans=12.0 2023-09-29 00:50:47,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 00:50:47,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 00:50:47,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 00:50:50,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:50:50,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:50:52,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:55,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:56,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=196400.0, ans=0.0 2023-09-29 00:50:59,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:59,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:51:02,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:51:02,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 00:51:04,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 00:51:06,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:09,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 00:51:10,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 00:51:14,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:51:14,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 00:51:14,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:51:17,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:51:17,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:51:18,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=196533.33333333334, ans=0.0 2023-09-29 00:51:18,127 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=196533.33333333334, ans=0.0 2023-09-29 00:51:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:51:20,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:51:23,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:51:25,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:27,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 00:51:27,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 00:51:27,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:51:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:51:36,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 00:51:36,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:51:43,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:49,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=196666.66666666666, ans=0.125 2023-09-29 00:51:52,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:51:52,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 00:51:54,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 00:51:56,123 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=196666.66666666666, ans=0.125 2023-09-29 00:51:57,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:57,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 00:51:58,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:00,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:52:05,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:07,279 INFO [train.py:1039] (0/4) Epoch 6, batch 2950, loss[loss=0.2476, simple_loss=0.3154, pruned_loss=0.08984, over 24034.00 frames. ], tot_loss[loss=0.2377, simple_loss=0.2997, pruned_loss=0.08786, over 4702467.83 frames. ], batch size: 86, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:52:08,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 00:52:08,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:08,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 00:52:11,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:12,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:52:14,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 00:52:14,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 00:52:14,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:52:14,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:19,120 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.73 vs. limit=15.0 2023-09-29 00:52:20,869 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.55 vs. limit=15.0 2023-09-29 00:52:21,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:23,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:24,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 00:52:24,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:28,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:52:29,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:52:29,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=196800.0, ans=0.1 2023-09-29 00:52:30,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:52:34,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 00:52:38,130 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=196800.0, ans=0.125 2023-09-29 00:52:40,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 00:52:42,917 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 00:52:43,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 00:52:45,975 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 00:52:46,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 00:52:46,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:47,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:47,553 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 00:52:47,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:52:49,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 00:52:51,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:51,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:52:54,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:56,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:52:56,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:52:58,027 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 00:52:58,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:58,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=196933.33333333334, ans=0.0 2023-09-29 00:52:59,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 00:53:01,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=196933.33333333334, ans=22.5 2023-09-29 00:53:05,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:07,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:07,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 00:53:07,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:53:10,062 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.213e+02 2.464e+02 2.740e+02 4.622e+02, threshold=4.928e+02, percent-clipped=0.0 2023-09-29 00:53:10,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 00:53:11,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:53:15,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:53:15,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:53:15,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:15,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:53:15,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=197000.0, ans=0.125 2023-09-29 00:53:17,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:53:19,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:19,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:53:19,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:53:20,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:20,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 00:53:22,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:22,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 00:53:24,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:26,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:53:27,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:53:28,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=197000.0, ans=0.125 2023-09-29 00:53:30,689 INFO [train.py:1039] (0/4) Epoch 6, batch 3000, loss[loss=0.2485, simple_loss=0.3055, pruned_loss=0.09578, over 23374.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.3, pruned_loss=0.08813, over 4713751.84 frames. ], batch size: 119, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:53:30,690 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 00:53:42,248 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.2.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.8401, 4.4867, 4.9028, 4.7197], device='cuda:0') 2023-09-29 00:53:42,607 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([6.6857, 6.3342, 6.4795, 6.5457], device='cuda:0') 2023-09-29 00:53:45,518 INFO [train.py:1071] (0/4) Epoch 6, validation: loss=0.3825, simple_loss=0.3275, pruned_loss=0.2187, over 1125622.00 frames. 
2023-09-29 00:53:45,519 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 00:53:47,217 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 00:53:47,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 00:53:49,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=197066.66666666666, ans=0.125 2023-09-29 00:53:50,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:50,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:53:51,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 00:53:51,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:53:53,266 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.58 vs. limit=12.0 2023-09-29 00:54:00,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:54:05,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=197133.33333333334, ans=0.1 2023-09-29 00:54:08,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 00:54:11,210 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.36 vs. limit=15.0 2023-09-29 00:54:14,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 00:54:17,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:54:18,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=197200.0, ans=0.1 2023-09-29 00:54:19,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:54:19,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:54:21,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:54:23,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:23,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 00:54:26,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 00:54:28,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:54:28,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 00:54:30,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:54:30,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:32,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:32,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:54:32,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=197200.0, ans=0.0 2023-09-29 00:54:37,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:54:37,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:37,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:54:39,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:41,605 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.16 vs. limit=15.0 2023-09-29 00:54:42,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 00:54:42,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 00:54:42,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:54:44,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:54:48,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:48,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:50,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:54:50,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 00:54:50,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:54:50,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 00:54:51,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:54:53,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 00:54:57,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:54:57,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 00:54:57,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 00:54:59,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 00:54:59,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:55:00,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:55:02,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:55:02,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:55:02,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:03,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:55:06,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=197333.33333333334, ans=0.125 2023-09-29 00:55:07,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 00:55:09,443 INFO [train.py:1039] (0/4) Epoch 6, batch 3050, loss[loss=0.2065, simple_loss=0.268, pruned_loss=0.07248, over 24475.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3015, pruned_loss=0.08926, over 4715775.78 frames. 
], batch size: 58, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:55:09,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:12,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:12,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:55:17,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:20,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 00:55:23,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=197466.66666666666, ans=0.0 2023-09-29 00:55:26,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 00:55:28,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 00:55:28,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 00:55:29,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=197466.66666666666, ans=0.1 2023-09-29 00:55:29,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=197466.66666666666, ans=0.0 2023-09-29 00:55:31,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:55:34,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:34,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:36,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:38,695 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:55:40,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:55:41,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:55:41,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:41,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:41,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 00:55:43,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:47,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:48,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:49,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 00:55:50,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:50,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:55:53,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:55,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:55:56,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:55:56,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 00:55:58,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=197600.0, ans=0.125 2023-09-29 00:56:01,694 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.11 vs. limit=22.5 2023-09-29 00:56:02,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:56:04,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:10,216 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.122e+02 2.325e+02 2.738e+02 3.532e+02, threshold=4.649e+02, percent-clipped=0.0 2023-09-29 00:56:10,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:10,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:56:10,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:56:12,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:12,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:56:14,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 00:56:15,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 00:56:17,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:17,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:19,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 00:56:23,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:27,223 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=15.0 2023-09-29 00:56:29,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:29,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=197733.33333333334, ans=0.0 2023-09-29 00:56:31,032 INFO [train.py:1039] (0/4) Epoch 6, batch 3100, loss[loss=0.2377, simple_loss=0.2761, pruned_loss=0.09963, over 22643.00 frames. ], tot_loss[loss=0.2387, simple_loss=0.3007, pruned_loss=0.08838, over 4725692.27 frames. ], batch size: 322, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:56:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:56:34,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 00:56:35,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 00:56:37,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 00:56:40,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 00:56:42,464 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=12.37 vs. limit=10.0 2023-09-29 00:56:43,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:56:44,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:56:44,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:48,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:56:52,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:00,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 00:57:04,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:57:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 00:57:05,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:07,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:07,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 00:57:09,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:57:10,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 00:57:10,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:57:11,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:13,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 00:57:13,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:57:17,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:57:17,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 00:57:20,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-09-29 00:57:20,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:20,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:25,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:25,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:25,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:57:26,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:57:26,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:57:29,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:57:29,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:57:29,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:29,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 00:57:34,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 00:57:36,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 00:57:39,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:57:39,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 00:57:39,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:41,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:41,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 00:57:49,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 00:57:51,984 INFO [train.py:1039] (0/4) Epoch 6, batch 3150, loss[loss=0.2377, simple_loss=0.2654, pruned_loss=0.105, over 19382.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.2984, pruned_loss=0.08762, over 4716243.19 frames. ], batch size: 388, lr: 1.69e-02, grad_scale: 16.0 2023-09-29 00:57:52,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:57:54,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:55,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 00:57:55,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:57:56,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 00:57:59,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:00,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:58:02,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 00:58:02,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:05,779 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 00:58:09,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 00:58:09,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:58:10,976 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 00:58:11,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:58:12,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. 
Duration: 0.95 2023-09-29 00:58:14,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 00:58:14,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 00:58:14,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:14,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:15,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:17,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 00:58:20,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:23,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:58:26,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 00:58:28,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 00:58:31,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:58:31,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:33,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 00:58:35,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 00:58:36,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:58:36,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:58:36,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:58:36,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:58:36,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:58:39,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:58:39,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:58:40,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-09-29 00:58:42,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:58:42,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:43,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:58:43,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:45,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 00:58:45,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=198266.66666666666, ans=0.05 2023-09-29 00:58:46,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:48,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 00:58:48,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:48,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 00:58:49,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 00:58:52,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 00:58:52,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:54,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 00:58:55,606 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.072e+02 2.389e+02 2.889e+02 3.902e+02, threshold=4.779e+02, percent-clipped=0.0 2023-09-29 00:58:55,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:58:55,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:59:00,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:59:01,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:02,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:59:07,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:59:07,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:12,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. 
Duration: 0.511125 2023-09-29 00:59:14,228 INFO [train.py:1039] (0/4) Epoch 6, batch 3200, loss[loss=0.2164, simple_loss=0.2938, pruned_loss=0.0695, over 24497.00 frames. ], tot_loss[loss=0.2348, simple_loss=0.2965, pruned_loss=0.08659, over 4717659.04 frames. ], batch size: 66, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 00:59:18,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:59:18,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:59:21,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:23,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:59:23,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 00:59:24,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:59:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:59:35,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 00:59:35,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=198466.66666666666, ans=0.2 2023-09-29 00:59:40,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=198466.66666666666, ans=0.125 2023-09-29 00:59:44,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:59:49,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=198533.33333333334, ans=0.2 2023-09-29 00:59:55,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 00:59:56,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:00:00,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 01:00:00,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:00:01,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=198600.0, ans=0.0 2023-09-29 01:00:04,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:00:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:00:05,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 01:00:09,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 01:00:10,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 01:00:12,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 01:00:13,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=198600.0, ans=0.1 2023-09-29 01:00:18,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 01:00:19,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:00:26,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:00:26,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,781 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 01:00:26,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:00:31,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 01:00:33,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 01:00:34,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 01:00:36,062 INFO [train.py:1039] (0/4) Epoch 6, batch 3250, loss[loss=0.258, simple_loss=0.2888, pruned_loss=0.1136, over 19170.00 frames. ], tot_loss[loss=0.2348, simple_loss=0.2966, pruned_loss=0.08645, over 4714235.59 frames. ], batch size: 388, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:00:36,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 01:00:36,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=198733.33333333334, ans=0.05 2023-09-29 01:00:37,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 01:00:39,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:00:42,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:00:42,486 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 01:00:42,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:00:42,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:00:42,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 01:00:48,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:00:51,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:00:55,377 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=198800.0, ans=0.125 2023-09-29 01:01:00,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:00,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 01:01:02,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:02,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:02,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:02,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=198800.0, ans=0.0 2023-09-29 01:01:05,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:05,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 01:01:06,062 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.77 vs. limit=15.0 2023-09-29 01:01:08,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:08,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:01:08,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:09,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:10,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=198866.66666666666, ans=0.1 2023-09-29 01:01:13,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:16,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:17,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 01:01:17,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:19,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:19,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:19,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:23,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 01:01:24,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:24,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:01:26,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:01:33,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 01:01:37,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=198933.33333333334, ans=0.125 2023-09-29 01:01:41,422 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.144e+02 2.444e+02 2.943e+02 3.918e+02, threshold=4.889e+02, percent-clipped=0.0 2023-09-29 01:01:41,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:01:42,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:42,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 01:01:43,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:01:43,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:01:44,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:47,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 01:01:47,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 01:01:47,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 01:01:47,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=199000.0, ans=0.1 2023-09-29 01:01:49,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:50,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:52,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:01:52,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:56,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:56,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:57,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 01:01:57,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:01:59,084 INFO [train.py:1039] (0/4) Epoch 6, batch 3300, loss[loss=0.2269, simple_loss=0.298, pruned_loss=0.07795, over 23980.00 frames. ], tot_loss[loss=0.2352, simple_loss=0.2971, pruned_loss=0.08661, over 4716627.20 frames. ], batch size: 80, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:02:00,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 01:02:00,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 01:02:02,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:02:04,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 01:02:07,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 01:02:07,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 01:02:07,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:09,992 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.65 vs. limit=10.0 2023-09-29 01:02:12,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:02:13,343 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.96 vs. limit=15.0 2023-09-29 01:02:14,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:02:14,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 01:02:17,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:02:17,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:02:19,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:20,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:02:25,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 01:02:25,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:25,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:27,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:28,764 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 01:02:30,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:02:31,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:02:31,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 01:02:31,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:02:31,957 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 01:02:36,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:36,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:02:38,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:40,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 01:02:41,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:02:41,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:42,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:02:42,620 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:02:43,774 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 01:02:44,278 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.87 vs. 
limit=15.0 2023-09-29 01:02:47,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 01:02:47,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:02:50,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 01:02:51,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:02:55,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:02:56,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:02:58,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:58,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:58,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:58,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:03:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:03:02,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 01:03:02,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=199266.66666666666, ans=0.04949747468305833 2023-09-29 01:03:03,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:03:05,021 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 01:03:06,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 01:03:09,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:03:09,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:09,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:13,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:03:13,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:14,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:03:14,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:14,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 01:03:16,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:17,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:03:21,881 INFO [train.py:1039] (0/4) Epoch 6, batch 3350, loss[loss=0.2116, simple_loss=0.2895, pruned_loss=0.06679, over 24465.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.2979, pruned_loss=0.08591, over 4731441.08 frames. ], batch size: 63, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:03:22,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 01:03:22,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:22,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=199400.0, ans=0.125 2023-09-29 01:03:23,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:25,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:03:25,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:03:28,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:03:28,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=199400.0, ans=0.0 2023-09-29 01:03:29,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:29,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:32,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:03:34,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:34,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:03:37,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:40,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:03:40,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:42,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:03:43,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 01:03:45,982 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. 
Duration: 0.896 2023-09-29 01:03:46,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:49,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 01:03:49,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 01:03:50,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:03:50,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:03:52,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:52,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 01:03:52,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:52,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:03:55,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:57,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:57,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:04:00,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:04:03,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:05,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:06,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:09,103 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-09-29 01:04:09,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:04:11,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:13,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:13,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:16,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:18,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-09-29 01:04:18,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:04:18,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 01:04:19,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:04:19,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 01:04:21,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:23,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:25,668 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.082e+02 2.289e+02 2.624e+02 4.671e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 01:04:31,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:32,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 01:04:33,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:04:33,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=199666.66666666666, ans=0.125 2023-09-29 01:04:34,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 01:04:34,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:04:39,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:04:42,777 INFO [train.py:1039] (0/4) Epoch 6, batch 3400, loss[loss=0.2422, simple_loss=0.2924, pruned_loss=0.09601, over 22798.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2994, pruned_loss=0.08694, over 4724291.92 frames. ], batch size: 322, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:04:42,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 01:04:42,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:04:43,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:04:44,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=199733.33333333334, ans=0.125 2023-09-29 01:04:45,251 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.12 vs. limit=15.0 2023-09-29 01:04:45,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:46,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 01:04:47,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:04:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 01:04:48,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:48,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:49,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:04:50,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:04:51,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 01:04:55,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 01:04:55,735 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 01:04:55,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:01,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:05:01,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:05:01,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:05:02,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:05:07,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:09,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 01:05:13,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:05:16,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:17,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:17,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:05:24,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:05:29,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 01:05:35,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-09-29 01:05:37,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:05:39,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:39,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:41,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:05:44,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:45,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=199933.33333333334, ans=0.0 2023-09-29 01:05:48,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:05:48,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:05:57,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:05:58,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 01:06:02,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:06:06,489 INFO [train.py:1039] (0/4) Epoch 6, batch 3450, loss[loss=0.3068, simple_loss=0.3322, pruned_loss=0.1407, over 19815.00 frames. 
], tot_loss[loss=0.2364, simple_loss=0.2993, pruned_loss=0.08677, over 4725937.90 frames. ], batch size: 388, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:06:06,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 01:06:09,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 01:06:11,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:13,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:06:13,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 01:06:14,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:06:15,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=200066.66666666666, ans=0.125 2023-09-29 01:06:17,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:06:18,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=200066.66666666666, ans=0.1 2023-09-29 01:06:23,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:06:24,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:06:26,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:06:26,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:28,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:34,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 01:06:40,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 01:06:40,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:06:40,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:06:42,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:42,720 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.97 vs. limit=15.0 2023-09-29 01:06:45,626 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=200200.0, ans=0.1 2023-09-29 01:06:49,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 01:06:50,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 01:06:54,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:06:54,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:56,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:06:58,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:06:59,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 01:06:59,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:03,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:07:04,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:06,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 01:07:08,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.96 vs. limit=15.0 2023-09-29 01:07:11,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 01:07:12,876 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.101e+02 2.630e+02 3.255e+02 5.395e+02, threshold=5.260e+02, percent-clipped=4.0 2023-09-29 01:07:14,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:07:16,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:19,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:24,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:24,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:07:25,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:07:25,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:30,108 INFO [train.py:1039] (0/4) Epoch 6, batch 3500, loss[loss=0.2424, simple_loss=0.3122, pruned_loss=0.08628, over 24062.00 frames. ], tot_loss[loss=0.2352, simple_loss=0.298, pruned_loss=0.08618, over 4724632.54 frames. ], batch size: 80, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:07:30,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:35,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 01:07:35,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 01:07:36,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:07:38,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=200400.0, ans=0.125 2023-09-29 01:07:39,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:07:41,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:41,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 01:07:44,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:07:47,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:49,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:07:49,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:07:49,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:07:50,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:07:50,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:07:51,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 01:07:55,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:55,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:07:58,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:01,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:03,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 01:08:03,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:08:07,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:07,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:08:08,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:10,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 01:08:10,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:12,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 01:08:12,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 01:08:13,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 01:08:13,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:15,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:15,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=200533.33333333334, ans=0.0 2023-09-29 01:08:16,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:16,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:08:19,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:08:21,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:08:27,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:08:28,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 01:08:28,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 01:08:28,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:08:32,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:33,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:35,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:39,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 01:08:40,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:42,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:43,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 01:08:45,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 01:08:47,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:08:47,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=200666.66666666666, ans=0.125 2023-09-29 01:08:48,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:48,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:08:48,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:08:50,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=200666.66666666666, ans=0.0 2023-09-29 01:08:53,106 INFO [train.py:1039] (0/4) Epoch 6, batch 3550, loss[loss=0.2405, simple_loss=0.2916, pruned_loss=0.09477, over 23642.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.2965, pruned_loss=0.08528, over 4729926.66 frames. ], batch size: 232, lr: 1.68e-02, grad_scale: 8.0 2023-09-29 01:08:53,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:08:56,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=200733.33333333334, ans=0.95 2023-09-29 01:09:02,275 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=200733.33333333334, ans=15.0 2023-09-29 01:09:02,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 01:09:05,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 01:09:07,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:09,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:09:13,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:14,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:09:14,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:09:16,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:16,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:09:16,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=200800.0, ans=0.125 2023-09-29 01:09:17,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:17,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:09:19,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 01:09:24,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:09:24,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:27,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:27,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:28,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:09:28,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 01:09:28,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:30,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:09:36,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:36,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:39,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:09:40,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 01:09:42,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:09:42,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 01:09:44,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:45,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:09:46,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:09:49,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=200933.33333333334, ans=0.125 2023-09-29 01:09:50,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 01:09:51,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:09:59,417 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.123e+02 2.427e+02 3.037e+02 5.186e+02, threshold=4.854e+02, percent-clipped=0.0 2023-09-29 01:09:59,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:09:59,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=201000.0, ans=0.0 2023-09-29 01:10:01,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 01:10:01,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:04,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:10:04,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 01:10:11,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 01:10:12,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:10:12,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:10:14,579 INFO [train.py:1039] (0/4) Epoch 6, batch 3600, loss[loss=0.2447, simple_loss=0.2974, pruned_loss=0.09601, over 23842.00 frames. ], tot_loss[loss=0.2343, simple_loss=0.2968, pruned_loss=0.08596, over 4722543.08 frames. ], batch size: 212, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:10:14,955 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=201066.66666666666, ans=0.125 2023-09-29 01:10:16,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:10:16,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:18,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:10:23,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:25,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:25,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:10:25,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:10:26,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:26,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 01:10:31,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:10:32,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:37,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:37,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 01:10:39,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:10:40,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:40,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 01:10:42,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:45,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:47,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:10:47,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=201200.0, ans=0.0 2023-09-29 01:10:48,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:51,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:51,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:10:53,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 01:10:57,892 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.02 vs. 
limit=10.0 2023-09-29 01:11:02,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:03,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:11:03,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 01:11:08,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:11:14,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:16,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:22,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=201333.33333333334, ans=0.125 2023-09-29 01:11:24,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:11:25,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:11:25,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 01:11:27,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 01:11:27,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. 
Duration: 0.97 2023-09-29 01:11:30,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:11:32,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:11:34,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 01:11:34,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:11:35,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:11:35,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:35,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 01:11:35,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 01:11:36,100 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=201333.33333333334, ans=0.125 2023-09-29 01:11:38,732 INFO [train.py:1039] (0/4) Epoch 6, batch 3650, loss[loss=0.236, simple_loss=0.2869, pruned_loss=0.09254, over 23772.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2967, pruned_loss=0.08559, over 4720475.83 frames. ], batch size: 164, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:11:40,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:11:41,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 01:11:46,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 01:11:46,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:11:51,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 01:11:51,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 01:11:53,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=201466.66666666666, ans=0.125 2023-09-29 01:11:56,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:11:56,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:11:58,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:12:01,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:12:01,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:12:03,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-09-29 01:12:03,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:12:03,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:05,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 01:12:06,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:12:06,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=201466.66666666666, ans=0.0 2023-09-29 01:12:07,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:07,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:09,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:12:12,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 01:12:12,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 01:12:14,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:12:15,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. 
Duration: 0.97 2023-09-29 01:12:17,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:17,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:12:21,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:12:25,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:25,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:12:26,100 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.53 vs. limit=15.0 2023-09-29 01:12:26,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:12:27,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:12:30,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:12:32,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:12:32,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=201600.0, ans=0.09899494936611666 2023-09-29 01:12:33,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:33,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:35,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:12:36,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:37,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:37,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=201600.0, ans=0.07 2023-09-29 01:12:44,479 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 01:12:46,261 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=201666.66666666666, ans=0.0 2023-09-29 01:12:47,284 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.133e+02 2.417e+02 2.802e+02 4.868e+02, threshold=4.835e+02, percent-clipped=1.0 2023-09-29 01:12:50,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:50,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:12:51,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:12:51,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:51,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:12:53,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:55,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 01:12:55,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:55,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=201666.66666666666, ans=0.04949747468305833 2023-09-29 01:12:58,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:12:59,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:13:01,767 INFO [train.py:1039] (0/4) Epoch 6, batch 3700, loss[loss=0.3154, simple_loss=0.3443, pruned_loss=0.1433, over 19245.00 frames. ], tot_loss[loss=0.2356, simple_loss=0.2979, pruned_loss=0.08662, over 4715522.61 frames. ], batch size: 388, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:13:01,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 01:13:03,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:03,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 01:13:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:13:03,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=201733.33333333334, ans=0.125 2023-09-29 01:13:05,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:13:06,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:13:10,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:13:13,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:14,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:15,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:13:17,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:17,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 01:13:17,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:19,243 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 01:13:28,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:13:28,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:13:28,586 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=201800.0, ans=0.0 2023-09-29 01:13:29,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:13:29,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 01:13:29,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:35,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:36,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 01:13:38,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:39,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 01:13:42,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:42,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:13:43,705 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.35 vs. limit=15.0 2023-09-29 01:13:44,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:13:48,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:48,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 01:13:48,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:48,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 01:13:50,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=201933.33333333334, ans=0.0 2023-09-29 01:13:53,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:13:54,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 01:13:59,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:59,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 01:14:02,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:14:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:14:02,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:02,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:02,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=201933.33333333334, ans=0.0 2023-09-29 01:14:07,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:07,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 01:14:09,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 01:14:09,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:14:10,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:14:12,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:14:14,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:14:15,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:14:17,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:14:18,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:14:20,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 01:14:23,837 INFO [train.py:1039] (0/4) Epoch 6, batch 3750, loss[loss=0.2489, simple_loss=0.298, pruned_loss=0.09991, over 22706.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.298, pruned_loss=0.08627, over 4725059.73 frames. ], batch size: 322, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:14:24,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:14:25,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:14:27,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 01:14:27,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 01:14:29,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:14:35,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:38,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:14:40,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:14:44,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:48,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:14:50,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 01:14:50,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:14:51,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:14:53,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:14:55,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 01:15:00,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 01:15:01,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:15:01,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:15:04,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:08,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:10,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:15:16,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 01:15:18,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:22,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:15:22,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:15:26,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 01:15:28,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=202333.33333333334, ans=0.125 2023-09-29 01:15:28,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=202333.33333333334, ans=0.0 2023-09-29 01:15:30,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:15:31,936 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.301e+02 2.601e+02 3.130e+02 4.781e+02, threshold=5.202e+02, percent-clipped=0.0 2023-09-29 01:15:32,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:15:35,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:15:36,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:15:39,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:15:46,255 INFO [train.py:1039] (0/4) Epoch 6, batch 3800, loss[loss=0.2412, simple_loss=0.2984, pruned_loss=0.09199, over 23615.00 frames. ], tot_loss[loss=0.2352, simple_loss=0.2982, pruned_loss=0.08611, over 4730028.34 frames. ], batch size: 134, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:15:48,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 01:15:48,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=202400.0, ans=0.125 2023-09-29 01:15:53,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:55,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:15:55,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 01:15:57,334 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=202400.0, ans=0.2 2023-09-29 01:15:58,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:58,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:00,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:16:01,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:16:01,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:03,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:16:04,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 01:16:05,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:16:05,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:07,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 01:16:10,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 01:16:10,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:16:13,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:15,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:16:17,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:16:19,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:16:19,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:20,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:22,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:16:28,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:16:28,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 01:16:29,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=202533.33333333334, ans=0.125 2023-09-29 01:16:30,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:35,449 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=202600.0, ans=0.125 2023-09-29 01:16:38,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:16:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:16:46,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 01:16:48,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 01:16:48,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:50,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.81 vs. 
limit=15.0 2023-09-29 01:16:51,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:51,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:55,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 01:16:58,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 01:17:00,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 01:17:00,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:01,752 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.74 vs. limit=22.5 2023-09-29 01:17:02,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:17:07,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:17:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:17:09,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=202733.33333333334, ans=0.1 2023-09-29 01:17:10,317 INFO [train.py:1039] (0/4) Epoch 6, batch 3850, loss[loss=0.2326, simple_loss=0.2862, pruned_loss=0.08954, over 23516.00 frames. 
], tot_loss[loss=0.2341, simple_loss=0.2972, pruned_loss=0.0855, over 4731177.18 frames. ], batch size: 134, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:17:14,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:17:15,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 01:17:18,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:17:18,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:23,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:17:24,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:27,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:17:27,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 01:17:28,493 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=20.44 vs. limit=15.0 2023-09-29 01:17:35,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:37,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 01:17:40,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:40,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:17:43,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:43,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:17:44,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:44,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:17:46,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:49,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:51,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:51,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:17:51,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 01:17:51,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. 
Duration: 0.91 2023-09-29 01:17:51,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:51,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:54,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=202866.66666666666, ans=0.125 2023-09-29 01:17:55,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:17:55,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:57,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 01:17:58,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 01:17:59,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=202933.33333333334, ans=0.125 2023-09-29 01:18:00,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:02,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 01:18:05,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-09-29 01:18:10,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:11,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:18:17,503 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.249e+02 2.602e+02 3.151e+02 5.214e+02, threshold=5.203e+02, percent-clipped=1.0 2023-09-29 01:18:17,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:17,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 01:18:19,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 01:18:23,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:23,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:26,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:18:26,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:18:27,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:18:27,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:18:29,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 01:18:29,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:18:30,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 01:18:30,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:30,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:32,220 INFO [train.py:1039] (0/4) Epoch 6, batch 3900, loss[loss=0.1925, simple_loss=0.2659, pruned_loss=0.05957, over 24382.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2967, pruned_loss=0.08545, over 4736887.86 frames. ], batch size: 56, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:18:33,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:18:33,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:36,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:18:36,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 01:18:36,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:38,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:18:38,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 01:18:38,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:44,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:44,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:18:45,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=203066.66666666666, ans=0.125 2023-09-29 01:18:45,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=203066.66666666666, ans=0.0 2023-09-29 01:18:46,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:49,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 01:18:49,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:51,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:18:52,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 01:18:52,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:18:54,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 01:18:54,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:55,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 01:18:58,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 01:19:02,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:02,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:19:04,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:19:04,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 01:19:08,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:08,943 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=203200.0, ans=0.125 2023-09-29 01:19:10,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=203200.0, ans=0.0 2023-09-29 01:19:12,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:19:13,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:19:13,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:13,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:19:21,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:19:21,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:19:30,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:19:33,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:19:41,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:19:44,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:44,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 01:19:46,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 01:19:46,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:46,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 01:19:46,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=203333.33333333334, ans=0.1 2023-09-29 01:19:50,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:50,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 01:19:50,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=203333.33333333334, ans=0.1 2023-09-29 01:19:55,273 INFO [train.py:1039] (0/4) Epoch 6, batch 3950, loss[loss=0.252, simple_loss=0.3019, pruned_loss=0.101, over 23796.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.2969, pruned_loss=0.08508, over 4738033.82 frames. ], batch size: 179, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:19:59,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 01:20:01,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 01:20:01,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:20:05,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:20:07,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:20:13,268 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 01:20:13,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 01:20:14,865 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 01:20:14,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:20:17,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:17,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:20:17,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:20:19,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=203466.66666666666, ans=0.09899494936611666 2023-09-29 01:20:20,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 01:20:24,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:20:24,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:24,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:20:24,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:20:26,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:20:30,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=203533.33333333334, ans=0.125 2023-09-29 01:20:36,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:20:38,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:20:42,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 01:20:47,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. 
Duration: 0.99 2023-09-29 01:20:47,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 01:20:47,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:20:49,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:20:59,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:20:59,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:21:00,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:00,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:21:00,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 01:21:02,045 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.135e+02 2.350e+02 2.654e+02 4.554e+02, threshold=4.701e+02, percent-clipped=0.0 2023-09-29 01:21:06,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:21:08,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:21:12,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. 
Duration: 0.96 2023-09-29 01:21:17,429 INFO [train.py:1039] (0/4) Epoch 6, batch 4000, loss[loss=0.2367, simple_loss=0.2993, pruned_loss=0.08705, over 23324.00 frames. ], tot_loss[loss=0.2337, simple_loss=0.2971, pruned_loss=0.08511, over 4742867.81 frames. ], batch size: 134, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:21:22,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:32,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:37,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:37,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:21:37,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:37,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 01:21:38,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:21:40,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 01:21:40,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:21:40,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-09-29 01:21:41,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:47,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:21:47,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:21:47,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:21:47,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:47,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:21:49,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:21:52,331 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 01:21:53,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:21:53,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:21:56,950 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 01:21:57,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 01:21:57,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:04,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 01:22:06,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:22:07,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:22:09,281 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 01:22:10,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:22:12,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 01:22:12,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:22:13,178 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.12 vs. limit=22.5 2023-09-29 01:22:13,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:14,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:22:15,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 01:22:15,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:22:16,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:19,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 01:22:19,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:22,544 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 01:22:27,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:22:30,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:22:32,365 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:22:34,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:22:35,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:35,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:22:36,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:22:39,600 INFO [train.py:1039] (0/4) Epoch 6, batch 4050, loss[loss=0.2325, simple_loss=0.2932, pruned_loss=0.08593, over 23238.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2975, pruned_loss=0.08508, over 4739434.35 frames. ], batch size: 105, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:22:43,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=204066.66666666666, ans=0.125 2023-09-29 01:22:45,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:46,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:22:47,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 01:22:47,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=204066.66666666666, ans=0.1 2023-09-29 01:22:48,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=204066.66666666666, ans=0.125 2023-09-29 01:22:49,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:22:49,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:22:51,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:22:52,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 01:22:54,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:58,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:59,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:00,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:23:02,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:23:03,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:23:06,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=204133.33333333334, ans=0.0 2023-09-29 01:23:08,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:09,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:23:12,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 01:23:14,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 01:23:14,498 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. 
Duration: 0.64 2023-09-29 01:23:14,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=204200.0, ans=0.1 2023-09-29 01:23:14,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=204200.0, ans=0.1 2023-09-29 01:23:16,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:23:18,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=204200.0, ans=0.1 2023-09-29 01:23:21,031 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.97 vs. limit=6.0 2023-09-29 01:23:23,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 01:23:24,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:29,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:33,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:33,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:23:33,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 01:23:34,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:35,096 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:23:38,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 01:23:38,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:23:40,631 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.34 vs. limit=22.5 2023-09-29 01:23:41,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:43,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 01:23:48,266 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.126e+02 2.466e+02 2.913e+02 5.658e+02, threshold=4.933e+02, percent-clipped=1.0 2023-09-29 01:23:48,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:56,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 01:23:56,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:56,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 01:23:59,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 01:23:59,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 01:23:59,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:02,554 INFO [train.py:1039] (0/4) Epoch 6, batch 4100, loss[loss=0.2281, simple_loss=0.307, pruned_loss=0.0746, over 24443.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.2979, pruned_loss=0.08576, over 4736003.29 frames. ], batch size: 69, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:24:02,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:04,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:04,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:24:12,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 01:24:14,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 01:24:16,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 01:24:17,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-09-29 01:24:17,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:17,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:17,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:19,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:24:19,572 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 01:24:23,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:23,400 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:24:24,168 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.18 vs. limit=15.0 2023-09-29 01:24:24,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:24:24,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:24,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:24:29,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 01:24:31,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:31,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:24:31,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 01:24:31,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:32,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:24:32,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:32,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:24:33,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 01:24:34,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:24:36,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 01:24:37,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:24:41,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 01:24:41,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 01:24:45,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:45,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:24:45,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:24:46,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 01:24:47,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:24:48,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:24:50,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 01:24:51,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:51,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:24:53,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:00,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:25:03,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=204600.0, ans=0.0 2023-09-29 01:25:05,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:06,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:25:15,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:15,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:21,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:24,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:25:25,896 INFO [train.py:1039] (0/4) Epoch 6, batch 4150, loss[loss=0.1962, simple_loss=0.2702, pruned_loss=0.0611, over 24330.00 frames. ], tot_loss[loss=0.2363, simple_loss=0.2987, pruned_loss=0.08696, over 4719161.09 frames. ], batch size: 61, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:25:27,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:25:28,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=204733.33333333334, ans=0.125 2023-09-29 01:25:29,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 01:25:29,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.05 vs. limit=15.0 2023-09-29 01:25:30,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:25:30,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:34,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 01:25:34,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:35,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 01:25:36,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.73 vs. limit=15.0 2023-09-29 01:25:37,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 01:25:37,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 01:25:39,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:44,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 01:25:44,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:48,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:25:49,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:25:50,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:25:52,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:25:52,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:54,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:25:58,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:02,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:02,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 01:26:03,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=204866.66666666666, ans=0.125 2023-09-29 01:26:07,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. 
Duration: 0.77 2023-09-29 01:26:07,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:26:07,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 01:26:07,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:26:07,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:09,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:11,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:17,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 01:26:21,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:23,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:26:24,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 01:26:24,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:26,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-09-29 01:26:29,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:26:30,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:31,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:33,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 01:26:33,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:26:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:26:33,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:26:33,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=205000.0, ans=0.0 2023-09-29 01:26:34,518 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.193e+02 2.474e+02 2.867e+02 4.434e+02, threshold=4.949e+02, percent-clipped=0.0 2023-09-29 01:26:36,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 01:26:36,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:36,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 01:26:36,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:26:37,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 01:26:37,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:37,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:26:39,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:42,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:42,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 01:26:42,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:49,014 INFO [train.py:1039] (0/4) Epoch 6, batch 4200, loss[loss=0.2577, simple_loss=0.3073, pruned_loss=0.1041, over 23368.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.297, pruned_loss=0.08642, over 4716173.07 frames. ], batch size: 105, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:26:49,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:26:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. 
Duration: 0.94 2023-09-29 01:26:54,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:26:56,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:26:57,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:26:57,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:57,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:58,763 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-09-29 01:26:59,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 01:27:04,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 01:27:04,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:07,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:09,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:27:12,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. 
Duration: 0.588875 2023-09-29 01:27:13,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:13,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:15,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 01:27:15,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:17,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:17,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:27:17,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:27:19,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:27:20,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=205133.33333333334, ans=0.125 2023-09-29 01:27:21,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 01:27:22,263 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.03 vs. limit=5.0 2023-09-29 01:27:22,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:27:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:27:29,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:27:32,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:27:33,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:27:36,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:27:36,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 01:27:36,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:27:36,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:27:42,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:27:45,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:52,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 01:27:52,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=205266.66666666666, ans=0.0 2023-09-29 01:27:55,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 01:27:57,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:00,956 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=205333.33333333334, ans=0.125 2023-09-29 01:28:02,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:28:04,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:05,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 01:28:11,988 INFO [train.py:1039] (0/4) Epoch 6, batch 4250, loss[loss=0.2178, simple_loss=0.2689, pruned_loss=0.0833, over 22761.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2955, pruned_loss=0.08603, over 4707987.90 frames. ], batch size: 322, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:28:12,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:28:16,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:28:16,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 01:28:18,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:23,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:28:23,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 01:28:23,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:28:27,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:30,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:35,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:35,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=205466.66666666666, ans=0.0 2023-09-29 01:28:37,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:38,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:28:38,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:28:40,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 01:28:42,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:42,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:44,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:28:45,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 01:28:51,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 01:28:51,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:53,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:53,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:53,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:28:53,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:55,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 01:28:58,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:28:58,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:29:03,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:05,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:07,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 01:29:07,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:29:08,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 01:29:09,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=205600.0, ans=0.0 2023-09-29 01:29:10,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:29:12,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:29:13,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:13,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:29:16,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 01:29:17,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:29:19,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:29:21,778 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.147e+02 2.416e+02 2.924e+02 5.280e+02, threshold=4.831e+02, percent-clipped=2.0 2023-09-29 01:29:22,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:25,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:26,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:29:26,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:29,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:30,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:29:33,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:29:33,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-09-29 01:29:35,411 INFO [train.py:1039] (0/4) Epoch 6, batch 4300, loss[loss=0.2191, simple_loss=0.2892, pruned_loss=0.07447, over 24443.00 frames. ], tot_loss[loss=0.2336, simple_loss=0.2949, pruned_loss=0.0862, over 4693720.15 frames. ], batch size: 63, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:29:35,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:41,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:41,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:29:45,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:55,793 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.44 vs. limit=10.0 2023-09-29 01:29:56,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:56,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 01:29:56,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:29:59,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:29:59,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 01:29:59,398 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 01:30:02,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:30:06,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:09,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 01:30:09,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:30:09,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 01:30:12,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:30:14,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:30:17,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:30:17,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:30:17,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:30:19,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 01:30:19,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:30:21,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 01:30:21,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 01:30:24,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:30:28,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:30:28,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:28,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 01:30:28,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 01:30:28,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 01:30:30,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:30:31,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 01:30:31,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 01:30:33,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.76 vs. limit=15.0 2023-09-29 01:30:35,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:35,927 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 01:30:37,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:30:39,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:39,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:41,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 01:30:43,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:43,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:44,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:30:44,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:44,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:30:46,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:30:49,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:50,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:52,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:57,154 INFO [train.py:1039] (0/4) Epoch 6, batch 4350, loss[loss=0.2484, simple_loss=0.3167, pruned_loss=0.09008, over 24658.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.2955, pruned_loss=0.08572, over 4715196.84 frames. ], batch size: 68, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:30:57,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=206066.66666666666, ans=0.07 2023-09-29 01:30:58,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 01:30:58,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 01:30:59,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=206066.66666666666, ans=0.125 2023-09-29 01:31:04,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:07,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:09,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=206066.66666666666, ans=0.125 2023-09-29 01:31:11,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:31:11,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:31:13,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=206133.33333333334, ans=0.025 2023-09-29 01:31:17,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:31:20,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:23,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:31:23,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:31:26,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=206133.33333333334, ans=0.125 2023-09-29 01:31:27,077 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.68 vs. limit=6.0 2023-09-29 01:31:27,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:31:28,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=206200.0, ans=0.5 2023-09-29 01:31:30,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:31:32,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:31:37,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 01:31:37,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:39,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:43,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:44,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. 
Duration: 0.99 2023-09-29 01:31:49,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:31:49,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:31:54,179 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.08 vs. limit=22.5 2023-09-29 01:31:54,671 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 01:31:56,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:31:56,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:31:57,753 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 01:31:58,350 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.80 vs. limit=15.0 2023-09-29 01:31:59,218 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 01:31:59,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:00,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:02,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 01:32:03,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:04,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:04,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:06,019 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.197e+02 2.443e+02 2.898e+02 4.711e+02, threshold=4.887e+02, percent-clipped=0.0 2023-09-29 01:32:07,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 01:32:07,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:07,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 01:32:09,430 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 01:32:09,437 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 01:32:09,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. 
Duration: 0.42 2023-09-29 01:32:12,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:32:12,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:32:13,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:14,050 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=206333.33333333334, ans=0.125 2023-09-29 01:32:15,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:32:16,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 01:32:18,821 INFO [train.py:1039] (0/4) Epoch 6, batch 4400, loss[loss=0.243, simple_loss=0.2985, pruned_loss=0.09379, over 23532.00 frames. ], tot_loss[loss=0.2351, simple_loss=0.297, pruned_loss=0.08657, over 4698721.38 frames. ], batch size: 134, lr: 1.65e-02, grad_scale: 32.0 2023-09-29 01:32:19,011 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 01:32:19,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:22,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=206400.0, ans=0.0 2023-09-29 01:32:23,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:32:23,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:25,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:26,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=206400.0, ans=0.125 2023-09-29 01:32:27,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 01:32:28,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 01:32:28,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 01:32:28,897 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 01:32:30,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:32:30,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:32,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 01:32:33,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:35,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:32:35,303 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 01:32:38,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:38,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 01:32:40,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 01:32:43,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 01:32:43,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 01:32:43,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 01:32:43,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:45,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:45,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:46,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:32:47,141 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:32:50,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 01:32:50,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 01:32:51,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:53,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:32:53,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:54,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:54,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:54,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 01:32:57,904 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 01:33:00,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:33:01,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=206533.33333333334, ans=0.95 2023-09-29 01:33:02,506 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:33:06,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:33:10,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 01:33:14,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:33:19,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:20,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:33:22,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 01:33:22,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:33:22,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:33:22,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:33:22,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 01:33:29,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 01:33:33,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 01:33:34,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 01:33:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:33:34,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 01:33:36,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:33:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:33:41,387 INFO [train.py:1039] (0/4) Epoch 6, batch 4450, loss[loss=0.1744, simple_loss=0.2465, pruned_loss=0.05113, over 21660.00 frames. ], tot_loss[loss=0.2343, simple_loss=0.2966, pruned_loss=0.086, over 4717336.08 frames. ], batch size: 47, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:33:41,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 01:33:44,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:48,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:33:49,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:33:54,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:33:54,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:33:59,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:00,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:34:04,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:34:04,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=206800.0, ans=0.125 2023-09-29 01:34:05,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:05,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 01:34:05,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:07,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:07,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 01:34:07,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:34:11,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:34:16,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:17,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:19,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:20,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:20,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:34:26,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:34:26,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 01:34:26,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 01:34:26,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 01:34:28,424 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=15.0 2023-09-29 01:34:29,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:29,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 01:34:32,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:34:36,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:37,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 01:34:37,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:37,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:37,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:34:37,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:40,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 01:34:42,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=206933.33333333334, ans=0.125 2023-09-29 01:34:45,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:34:46,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 01:34:48,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:34:49,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:49,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:51,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=207000.0, ans=0.05 2023-09-29 01:34:52,741 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.158e+02 2.416e+02 2.828e+02 3.801e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 01:34:52,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:52,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:34:54,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 01:34:58,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 01:35:01,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:35:04,023 INFO [train.py:1039] (0/4) Epoch 6, batch 4500, loss[loss=0.2463, simple_loss=0.3034, pruned_loss=0.09458, over 23325.00 frames. ], tot_loss[loss=0.2363, simple_loss=0.298, pruned_loss=0.08726, over 4705239.31 frames. ], batch size: 105, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:35:05,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:06,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=207066.66666666666, ans=0.0 2023-09-29 01:35:07,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 01:35:07,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 01:35:08,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:16,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:35:16,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:16,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 01:35:18,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:35:18,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:18,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:32,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:34,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:35:36,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:35:36,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:35:37,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:35:43,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:35:47,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:35:51,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 01:35:54,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:35:54,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 01:35:57,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:35:57,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:36:01,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:36:01,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 01:36:01,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:36:01,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:06,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:36:06,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 01:36:09,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:11,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:36:11,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:36:14,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 01:36:16,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 01:36:16,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 01:36:16,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=207333.33333333334, ans=0.0 2023-09-29 01:36:20,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 01:36:21,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=207333.33333333334, ans=0.0 2023-09-29 01:36:25,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 01:36:26,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:27,962 INFO [train.py:1039] (0/4) Epoch 6, batch 4550, loss[loss=0.238, simple_loss=0.3111, pruned_loss=0.08247, over 24545.00 frames. 
], tot_loss[loss=0.2349, simple_loss=0.2961, pruned_loss=0.08688, over 4697176.51 frames. ], batch size: 71, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:36:29,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:29,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:34,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:35,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=207400.0, ans=0.125 2023-09-29 01:36:40,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:36:42,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:36:43,407 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.76 vs. limit=22.5 2023-09-29 01:36:45,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:36:45,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:36:45,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:47,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:36:47,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:52,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:36:55,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 01:36:55,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 01:36:57,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:36:58,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 01:37:02,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 01:37:02,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:05,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 01:37:08,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:37:10,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:37:12,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:37:14,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 01:37:17,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:18,154 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.22 vs. limit=15.0 2023-09-29 01:37:20,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:20,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:21,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:23,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 01:37:24,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 01:37:25,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:37:25,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 01:37:26,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. 
Duration: 0.54 2023-09-29 01:37:26,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:27,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=207600.0, ans=0.0 2023-09-29 01:37:28,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:28,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:31,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:31,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:37:33,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:37:33,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 01:37:37,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:37,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:37:37,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 01:37:37,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 01:37:37,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 01:37:37,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=207666.66666666666, ans=0.0 2023-09-29 01:37:39,090 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 2.083e+02 2.307e+02 2.767e+02 3.692e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 01:37:42,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:37:42,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:37:44,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:37:46,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:46,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:37:46,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=207666.66666666666, ans=0.2 2023-09-29 01:37:47,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:37:50,410 INFO [train.py:1039] (0/4) Epoch 6, batch 4600, loss[loss=0.2286, simple_loss=0.2806, pruned_loss=0.08837, over 23427.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2951, pruned_loss=0.0863, over 4712108.43 frames. 
], batch size: 285, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:37:50,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:37:52,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:53,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:56,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:37:56,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:37:58,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:37:59,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 01:38:00,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:38:04,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=207733.33333333334, ans=0.125 2023-09-29 01:38:05,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:38:05,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:38:08,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:17,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 01:38:18,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:20,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:22,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:38:22,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:22,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=207866.66666666666, ans=0.125 2023-09-29 01:38:28,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 01:38:28,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:38:28,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:38:35,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:35,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 01:38:36,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:38:38,427 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=207933.33333333334, ans=0.1 2023-09-29 01:38:41,986 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=207933.33333333334, ans=0.125 2023-09-29 01:38:44,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 01:38:44,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:38:46,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=207933.33333333334, ans=15.0 2023-09-29 01:38:47,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=207933.33333333334, ans=0.125 2023-09-29 01:38:48,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:50,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:38:53,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:53,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. 
Duration: 0.4555625 2023-09-29 01:38:53,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:55,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 01:38:55,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:55,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:38:58,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:58,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:00,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:00,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 01:39:00,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 01:39:00,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 01:39:00,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:02,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:39:02,743 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.32 vs. limit=15.0 2023-09-29 01:39:03,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:05,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:13,056 INFO [train.py:1039] (0/4) Epoch 6, batch 4650, loss[loss=0.2013, simple_loss=0.2743, pruned_loss=0.06413, over 24459.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2952, pruned_loss=0.08621, over 4714690.10 frames. ], batch size: 58, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:39:16,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:39:18,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:20,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:20,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:39:21,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:21,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:21,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:39:26,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 01:39:30,456 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.87 vs. limit=22.5 2023-09-29 01:39:31,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:39:32,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 01:39:32,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:32,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 01:39:34,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:39:34,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 01:39:34,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 01:39:34,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:36,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:39:39,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 01:39:40,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:40,684 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 01:39:42,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:43,256 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.03 vs. limit=10.0 2023-09-29 01:39:44,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 01:39:47,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:47,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:39:47,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 01:39:50,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:53,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=208200.0, ans=0.125 2023-09-29 01:39:55,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:39:58,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:40:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:05,151 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=208266.66666666666, ans=0.125 2023-09-29 01:40:06,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:08,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:08,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:40:09,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=208266.66666666666, ans=0.1 2023-09-29 01:40:11,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 01:40:12,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 01:40:12,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 01:40:12,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 01:40:15,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:40:23,229 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.129e+02 2.354e+02 2.649e+02 3.887e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 01:40:23,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:40:23,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:24,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 01:40:25,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:27,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:27,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:40:29,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:40:32,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:40:32,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:34,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 01:40:36,314 INFO [train.py:1039] (0/4) Epoch 6, batch 4700, loss[loss=0.2223, simple_loss=0.2853, pruned_loss=0.07961, over 20131.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.2953, pruned_loss=0.08581, over 4714140.01 frames. ], batch size: 44, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:40:36,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:37,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:40:37,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:40:38,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 01:40:39,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:40:41,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 01:40:49,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:49,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:50,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:40:52,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 01:40:53,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:40:56,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 01:40:58,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 01:41:02,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:02,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:41:02,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:41:06,822 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.18 vs. limit=12.0 2023-09-29 01:41:07,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:14,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:41:15,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:41:18,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:41:25,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. 
Duration: 0.68 2023-09-29 01:41:26,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:41:27,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:30,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 01:41:34,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:41:36,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=208600.0, ans=0.0 2023-09-29 01:41:40,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:41:40,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 01:41:40,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=208666.66666666666, ans=0.125 2023-09-29 01:41:43,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:43,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:46,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:47,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 01:41:47,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 01:41:48,646 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 01:41:50,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:53,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 01:41:54,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:57,778 INFO [train.py:1039] (0/4) Epoch 6, batch 4750, loss[loss=0.2162, simple_loss=0.2871, pruned_loss=0.07267, over 24464.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.2954, pruned_loss=0.08565, over 4716859.21 frames. ], batch size: 63, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:41:58,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 01:41:59,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:42:01,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 01:42:04,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:06,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:42:07,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 01:42:07,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:11,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 01:42:11,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=208733.33333333334, ans=0.0 2023-09-29 01:42:13,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:42:15,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:42:15,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:20,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 01:42:22,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=208800.0, ans=0.125 2023-09-29 01:42:24,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 01:42:26,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 01:42:26,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:31,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:31,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=208866.66666666666, ans=0.1 2023-09-29 01:42:32,713 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 01:42:32,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 01:42:35,255 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-09-29 01:42:38,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 01:42:40,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:42,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:42:46,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:42:46,608 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 01:42:46,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:42:46,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=208933.33333333334, ans=0.125 2023-09-29 01:42:50,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:42:52,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=208933.33333333334, ans=0.125 2023-09-29 01:42:54,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:42:54,441 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=208933.33333333334, ans=0.125 2023-09-29 01:42:55,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 01:42:55,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 01:42:57,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:57,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 01:42:57,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:58,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:42:58,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 01:43:00,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 01:43:03,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:06,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:43:06,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 01:43:06,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:09,480 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.147e+02 2.364e+02 2.785e+02 5.281e+02, threshold=4.728e+02, percent-clipped=1.0 2023-09-29 01:43:09,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:09,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:43:11,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:43:11,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:43:11,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=209000.0, ans=0.125 2023-09-29 01:43:15,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:15,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 01:43:16,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 01:43:17,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 01:43:21,131 INFO [train.py:1039] (0/4) Epoch 6, batch 4800, loss[loss=0.2183, simple_loss=0.2888, pruned_loss=0.07393, over 24460.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2967, pruned_loss=0.08619, over 4722869.21 frames. ], batch size: 63, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:43:21,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:43:21,444 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=209066.66666666666, ans=0.0 2023-09-29 01:43:23,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:24,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. 
Duration: 0.85 2023-09-29 01:43:26,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=209066.66666666666, ans=0.125 2023-09-29 01:43:30,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:30,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:34,011 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.49 vs. limit=15.0 2023-09-29 01:43:36,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:43:37,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:37,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:39,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 01:43:40,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:40,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:43:42,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 01:43:48,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:43:49,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:50,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:43:52,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:52,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:43:52,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:53,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:56,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:58,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=209200.0, ans=0.0 2023-09-29 01:43:59,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 01:44:04,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:44:04,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:06,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=209200.0, ans=0.125 2023-09-29 01:44:07,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 01:44:07,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 01:44:07,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:07,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:44:08,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=209266.66666666666, ans=0.125 2023-09-29 01:44:09,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:44:09,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:09,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:44:11,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 01:44:11,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:14,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:14,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=209266.66666666666, ans=0.125 2023-09-29 01:44:17,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:18,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:19,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=209266.66666666666, ans=0.0 2023-09-29 01:44:19,772 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.61 vs. limit=15.0 2023-09-29 01:44:23,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 01:44:25,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:25,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:25,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:44:26,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:44:26,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=209333.33333333334, ans=0.2 2023-09-29 01:44:30,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:31,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=209333.33333333334, ans=0.125 2023-09-29 01:44:32,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:44:32,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:44:34,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:44:34,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:44:37,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:39,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:39,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:40,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. 
Duration: 0.94 2023-09-29 01:44:42,324 INFO [train.py:1039] (0/4) Epoch 6, batch 4850, loss[loss=0.3259, simple_loss=0.358, pruned_loss=0.1468, over 19698.00 frames. ], tot_loss[loss=0.235, simple_loss=0.2972, pruned_loss=0.08637, over 4728488.97 frames. ], batch size: 388, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:44:42,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 01:44:42,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:44:42,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:45,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:51,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 01:44:53,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:56,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:44:59,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 01:44:59,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:02,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:45:05,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:45:06,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:45:06,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 01:45:12,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:45:14,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:45:14,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:45:15,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:45:15,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 01:45:18,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:45:19,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:45:23,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:24,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 01:45:24,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 01:45:26,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=209533.33333333334, ans=0.07 2023-09-29 01:45:27,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:45:33,112 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=209600.0, ans=0.0 2023-09-29 01:45:34,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:45:34,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 01:45:36,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:45:36,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:45:39,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:45:42,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. 
Duration: 0.99 2023-09-29 01:45:42,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:42,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 01:45:44,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:44,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:45:45,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 01:45:49,440 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=209666.66666666666, ans=0.1 2023-09-29 01:45:53,386 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.040e+02 2.308e+02 2.752e+02 3.700e+02, threshold=4.617e+02, percent-clipped=0.0 2023-09-29 01:45:55,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:00,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:46:00,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:04,396 INFO [train.py:1039] (0/4) Epoch 6, batch 4900, loss[loss=0.2422, simple_loss=0.3095, pruned_loss=0.08744, over 24487.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2964, pruned_loss=0.08632, over 4724403.52 frames. 
], batch size: 66, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:46:06,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 01:46:06,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:46:11,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:13,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:13,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:46:17,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 01:46:23,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 01:46:26,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 01:46:28,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 01:46:28,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:29,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:29,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:46:29,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:29,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:46:29,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 01:46:33,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 01:46:33,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:46:34,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:46:36,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:39,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:46:39,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:42,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:42,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 01:46:44,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 01:46:44,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:44,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 01:46:44,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 01:46:51,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 01:46:52,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=209866.66666666666, ans=0.0 2023-09-29 01:46:53,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:46:54,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:46:54,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:46:56,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:56,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:46:56,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:46:56,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. 
Duration: 0.96 2023-09-29 01:46:59,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:01,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:47:02,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:47:05,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 01:47:07,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:47:07,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 01:47:07,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 01:47:14,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:15,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:47:16,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=210000.0, ans=0.125 2023-09-29 01:47:18,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 01:47:18,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 01:47:18,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:47:20,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:26,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:26,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:47:26,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:26,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:47:28,397 INFO [train.py:1039] (0/4) Epoch 6, batch 4950, loss[loss=0.2462, simple_loss=0.3068, pruned_loss=0.09281, over 23266.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.2943, pruned_loss=0.08573, over 4704050.19 frames. ], batch size: 93, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:47:28,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:47:31,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:31,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:36,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-29 01:47:36,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 01:47:37,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:47:37,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 01:47:37,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:39,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:47:39,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:47:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:47:41,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:42,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:47:44,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:47:44,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:46,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 01:47:46,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:51,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:47:57,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:48:02,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:03,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:03,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:48:05,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 01:48:05,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 01:48:08,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:10,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:48:10,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 01:48:11,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:48:11,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:48:13,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:48:14,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:17,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:48:20,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:48:24,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:24,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:25,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 01:48:25,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:48:27,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:48:31,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:48:32,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:48:32,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:48:35,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:35,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:48:36,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:48:38,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:48:38,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:48:39,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:41,012 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.165e+02 2.494e+02 2.935e+02 4.100e+02, threshold=4.988e+02, percent-clipped=0.0 2023-09-29 01:48:41,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 01:48:44,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 01:48:49,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 01:48:49,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:48:50,378 INFO [train.py:1039] (0/4) Epoch 6, batch 5000, loss[loss=0.2409, simple_loss=0.2895, pruned_loss=0.09613, over 23783.00 frames. ], tot_loss[loss=0.2315, simple_loss=0.2935, pruned_loss=0.08476, over 4691654.35 frames. ], batch size: 164, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:48:55,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=210400.0, ans=0.07 2023-09-29 01:48:57,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:57,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:48:58,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 01:49:00,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 01:49:01,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:04,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 01:49:06,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 01:49:06,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:49:08,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 01:49:08,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:08,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:08,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 01:49:08,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:08,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:11,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 01:49:12,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 01:49:14,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:49:14,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 01:49:14,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 01:49:14,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:14,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=210466.66666666666, ans=0.0 2023-09-29 01:49:15,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:49:15,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 01:49:15,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 01:49:17,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=210466.66666666666, ans=0.125 2023-09-29 01:49:18,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 01:49:18,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:18,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:20,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 01:49:20,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:49:21,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 01:49:23,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:23,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 01:49:23,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=210533.33333333334, ans=0.125 2023-09-29 01:49:24,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 01:49:25,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=210533.33333333334, ans=0.2 2023-09-29 01:49:26,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:49:28,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:49:31,880 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 01:49:35,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:36,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:36,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:49:41,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. 
Duration: 0.98 2023-09-29 01:49:41,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:41,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:42,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:49:45,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:49:45,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:48,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:50,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:56,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 01:50:00,078 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=210666.66666666666, ans=0.125 2023-09-29 01:50:01,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:09,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:50:11,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:50:11,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:50:11,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:13,239 INFO [train.py:1039] (0/4) Epoch 6, batch 5050, loss[loss=0.2306, simple_loss=0.2866, pruned_loss=0.08734, over 23715.00 frames. ], tot_loss[loss=0.2327, simple_loss=0.2941, pruned_loss=0.08559, over 4681221.36 frames. ], batch size: 149, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:50:13,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:50:13,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:50:13,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:14,022 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.63 vs. limit=15.0 2023-09-29 01:50:18,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:18,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 01:50:20,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:50:21,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 01:50:21,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:50:21,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=210733.33333333334, ans=0.125 2023-09-29 01:50:23,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 01:50:23,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:23,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:50:26,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:50:27,201 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.49 vs. limit=15.0 2023-09-29 01:50:27,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:50:29,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:50:30,128 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.71 vs. limit=10.0 2023-09-29 01:50:40,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. 
Duration: 0.91 2023-09-29 01:50:42,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:50:42,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:50:42,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 01:50:42,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:50:43,121 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-09-29 01:50:43,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:44,206 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=210866.66666666666, ans=0.1 2023-09-29 01:50:45,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:46,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:50:46,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 01:50:46,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 01:50:49,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:50:52,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:50:54,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:54,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 01:50:55,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:50:58,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 01:51:00,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:51:00,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:51:02,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:03,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:51:05,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:08,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:51:08,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:51:09,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:51:09,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:51:09,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 01:51:11,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:51:13,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:51:15,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=210933.33333333334, ans=0.0 2023-09-29 01:51:17,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:51:17,895 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 01:51:17,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:51:19,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:19,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:21,346 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-09-29 01:51:24,334 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.235e+02 2.528e+02 3.059e+02 5.158e+02, threshold=5.056e+02, percent-clipped=2.0 2023-09-29 01:51:24,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:24,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 01:51:24,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:28,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:29,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:29,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 01:51:31,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 01:51:32,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:32,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:51:33,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:51:34,397 INFO [train.py:1039] (0/4) Epoch 6, batch 5100, loss[loss=0.2336, simple_loss=0.2932, pruned_loss=0.08701, over 23754.00 frames. 
], tot_loss[loss=0.2341, simple_loss=0.2957, pruned_loss=0.0863, over 4681321.09 frames. ], batch size: 164, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:51:36,167 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 01:51:39,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:43,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 01:51:43,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 01:51:44,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:46,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:50,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:50,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 01:51:50,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 01:51:50,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=211133.33333333334, ans=0.125 2023-09-29 01:51:54,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 01:51:56,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:52:00,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:52:04,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 01:52:05,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:06,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:52:06,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:52:09,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 01:52:10,188 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:52:11,384 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 01:52:12,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 01:52:12,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 01:52:14,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 01:52:18,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:30,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:52:31,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 01:52:31,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 01:52:33,956 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 01:52:36,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 01:52:36,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:39,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 01:52:40,619 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.48 vs. limit=15.0 2023-09-29 01:52:43,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-09-29 01:52:44,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:52:47,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:52:49,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 01:52:49,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:52:50,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 01:52:55,307 INFO [train.py:1039] (0/4) Epoch 6, batch 5150, loss[loss=0.2261, simple_loss=0.2942, pruned_loss=0.07899, over 24664.00 frames. ], tot_loss[loss=0.2344, simple_loss=0.2963, pruned_loss=0.08629, over 4686938.36 frames. ], batch size: 65, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:52:55,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:52:55,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:52:55,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:52:55,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=211400.0, ans=0.1 2023-09-29 01:52:57,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 01:52:59,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:52:59,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:53:00,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 01:53:00,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 01:53:00,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 01:53:00,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:53:00,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 01:53:03,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:03,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 01:53:05,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:06,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:12,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 01:53:12,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 01:53:14,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:15,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:53:17,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:53:17,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:17,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:17,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:53:17,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:53:18,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 01:53:20,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:53:20,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:53:23,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 01:53:25,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 01:53:27,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:53:32,779 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.28 vs. limit=15.0 2023-09-29 01:53:33,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:53:35,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 01:53:39,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:46,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:49,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:52,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:53:52,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:53:55,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-09-29 01:53:57,469 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=211600.0, ans=0.0 2023-09-29 01:53:59,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:59,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:53:59,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:54:04,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:05,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:54:05,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 01:54:08,361 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.054e+02 2.300e+02 2.668e+02 5.365e+02, threshold=4.600e+02, percent-clipped=1.0 2023-09-29 01:54:11,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:13,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:54:14,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:54:14,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 01:54:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:54:15,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=211666.66666666666, ans=0.0 2023-09-29 01:54:16,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:54:16,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:54:16,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:54:17,600 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.97 vs. limit=15.0 2023-09-29 01:54:18,915 INFO [train.py:1039] (0/4) Epoch 6, batch 5200, loss[loss=0.1926, simple_loss=0.2637, pruned_loss=0.06077, over 24447.00 frames. ], tot_loss[loss=0.236, simple_loss=0.2978, pruned_loss=0.08708, over 4685543.25 frames. ], batch size: 58, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:54:20,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=211733.33333333334, ans=0.0 2023-09-29 01:54:22,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:54:24,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:54:27,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:54:28,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 01:54:30,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:54:30,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:32,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:32,865 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.28 vs. limit=15.0 2023-09-29 01:54:34,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:54:34,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:37,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 01:54:39,253 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=211800.0, ans=0.1 2023-09-29 01:54:40,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:54:41,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:43,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-09-29 01:54:46,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:54:47,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:54:48,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 01:54:48,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 01:54:51,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 01:54:51,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:51,887 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 01:54:51,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:56,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:56,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:54:57,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 01:54:57,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 01:54:59,709 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.24 vs. limit=15.0 2023-09-29 01:55:00,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=211866.66666666666, ans=0.125 2023-09-29 01:55:01,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:03,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 01:55:05,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 01:55:05,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 01:55:10,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 01:55:11,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:55:16,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:55:16,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:17,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 01:55:19,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 01:55:19,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:55:19,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:19,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:55:24,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:24,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:55:30,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:55:32,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:32,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:32,993 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=15.0 2023-09-29 01:55:36,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:38,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-09-29 01:55:39,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:39,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:55:41,377 INFO [train.py:1039] (0/4) Epoch 6, batch 5250, loss[loss=0.2612, simple_loss=0.3319, pruned_loss=0.09529, over 24002.00 frames. ], tot_loss[loss=0.2354, simple_loss=0.2974, pruned_loss=0.08665, over 4683722.17 frames. ], batch size: 86, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:55:41,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:41,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:55:41,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:55:45,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:47,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=212066.66666666666, ans=0.125 2023-09-29 01:55:48,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:48,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:55:49,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 01:55:56,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:57,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:56:00,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:56:01,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:56:05,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 01:56:05,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:56:06,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:56:16,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=212200.0, ans=0.2 2023-09-29 01:56:19,646 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.47 vs. 
limit=22.5 2023-09-29 01:56:23,892 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:56:44,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=212333.33333333334, ans=0.125 2023-09-29 01:56:46,952 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.177e+02 2.529e+02 3.121e+02 4.794e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 01:56:47,266 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:56:55,333 INFO [train.py:1039] (0/4) Epoch 6, batch 5300, loss[loss=0.2522, simple_loss=0.3032, pruned_loss=0.1006, over 23710.00 frames. ], tot_loss[loss=0.2337, simple_loss=0.2955, pruned_loss=0.08597, over 4672229.39 frames. ], batch size: 179, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:57:11,114 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-6.pt 2023-09-29 01:57:19,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:57:19,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 01:57:19,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 01:57:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:19,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 01:57:19,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:19,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:19,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:20,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:20,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:20,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:57:20,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:57:20,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 01:57:20,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 01:57:20,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 01:57:21,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:57:21,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. 
Duration: 0.83 2023-09-29 01:57:21,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 01:57:21,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:22,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:22,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:22,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:22,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:57:23,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:23,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:23,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:23,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:23,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:23,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 01:57:23,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:23,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:57:24,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 01:57:24,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:25,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:25,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 01:57:25,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 01:57:25,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:57:25,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:25,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 01:57:25,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 01:57:25,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 01:57:26,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:57:26,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:27,101 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 01:57:27,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 01:57:27,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:57:27,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:27,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 01:57:27,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 01:57:27,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 01:57:27,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:30,914 INFO [train.py:1039] (0/4) Epoch 7, batch 0, loss[loss=0.214, simple_loss=0.284, pruned_loss=0.07198, over 24325.00 frames. ], tot_loss[loss=0.214, simple_loss=0.284, pruned_loss=0.07198, over 24325.00 frames. 
], batch size: 61, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:57:30,915 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 01:57:45,837 INFO [train.py:1071] (0/4) Epoch 7, validation: loss=0.2938, simple_loss=0.3001, pruned_loss=0.1437, over 1125622.00 frames. 2023-09-29 01:57:45,838 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 01:57:47,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 01:57:48,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:57:51,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:57:56,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:56,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:57:58,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:58,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 01:58:00,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 01:58:03,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:03,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 01:58:05,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=212546.66666666666, ans=0.1 2023-09-29 01:58:07,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:07,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:09,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:58:09,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:10,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 01:58:12,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:22,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:58:22,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:24,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 01:58:27,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:58:29,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 01:58:29,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=212613.33333333334, ans=0.125 2023-09-29 01:58:30,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:35,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:58:40,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:40,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=212680.0, ans=0.0 2023-09-29 01:58:41,369 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.04 vs. limit=15.0 2023-09-29 01:58:46,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 01:58:51,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 01:58:51,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:58:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:51,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 01:58:53,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:54,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 01:58:56,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:57,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:59:01,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:01,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=212746.66666666666, ans=0.125 2023-09-29 01:59:04,538 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 01:59:05,959 INFO [train.py:1039] (0/4) Epoch 7, batch 50, loss[loss=0.2225, simple_loss=0.3002, pruned_loss=0.07243, over 24284.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.2977, pruned_loss=0.08455, over 1072934.76 frames. ], batch size: 74, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:59:06,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:59:08,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=212813.33333333334, ans=0.125 2023-09-29 01:59:09,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 01:59:11,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:11,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 01:59:11,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:59:12,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:59:15,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:16,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:20,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:22,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 01:59:22,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:31,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:59:31,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 01:59:32,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-09-29 01:59:36,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:59:36,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:59:38,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:38,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:59:39,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:59:39,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:59:39,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:48,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:59:49,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:49,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:59:51,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 01:59:54,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 01:59:54,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:59:54,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 01:59:55,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:58,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 01:59:58,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=213013.33333333334, ans=0.0 2023-09-29 02:00:01,120 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.205e+02 2.565e+02 2.922e+02 4.560e+02, threshold=5.129e+02, percent-clipped=0.0 2023-09-29 02:00:05,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=213013.33333333334, ans=0.125 2023-09-29 02:00:06,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:06,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:00:06,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:08,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 02:00:08,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:11,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 02:00:11,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 02:00:13,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:13,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:15,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:00:16,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:00:16,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 02:00:16,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 02:00:18,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 02:00:20,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:20,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 02:00:21,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 02:00:21,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 02:00:21,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:23,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:24,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:00:24,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:00:27,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:00:29,758 INFO [train.py:1039] (0/4) Epoch 7, batch 100, loss[loss=0.2328, simple_loss=0.2867, pruned_loss=0.08943, over 23820.00 frames. ], tot_loss[loss=0.2354, simple_loss=0.2995, pruned_loss=0.08561, over 1878558.93 frames. ], batch size: 195, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:00:34,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:00:36,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:38,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. 
Duration: 0.83 2023-09-29 02:00:38,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:41,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:00:41,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:41,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:41,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:41,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 02:00:44,424 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.23 vs. limit=15.0 2023-09-29 02:00:46,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:00:46,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:46,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:46,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 02:00:51,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 02:00:52,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:52,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:53,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=213213.33333333334, ans=0.0 2023-09-29 02:00:54,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:00:56,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:01:01,075 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 02:01:01,122 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 02:01:04,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:04,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:01:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:01:11,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 02:01:13,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:15,370 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-32000.pt 2023-09-29 02:01:21,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:22,063 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 02:01:25,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:01:27,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=213346.66666666666, ans=0.1 2023-09-29 02:01:28,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:01:30,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:01:33,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:38,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 02:01:38,878 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=213413.33333333334, ans=0.125 2023-09-29 02:01:40,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:01:43,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:45,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:47,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:47,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:01:47,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:47,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 02:01:47,572 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 02:01:47,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:49,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:01:49,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 02:01:49,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:50,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 02:01:50,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:01:50,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:01:50,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:50,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:52,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:52,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=213413.33333333334, ans=0.2 2023-09-29 02:01:53,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:01:54,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:01:57,192 INFO [train.py:1039] (0/4) Epoch 7, batch 150, loss[loss=0.3311, simple_loss=0.3619, pruned_loss=0.1501, over 19365.00 frames. ], tot_loss[loss=0.2331, simple_loss=0.2975, pruned_loss=0.08429, over 2513946.65 frames. 
], batch size: 388, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:01:57,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:59,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:59,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:00,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:02,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:03,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:07,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:02:08,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:12,969 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.01 vs. limit=15.0 2023-09-29 02:02:13,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 02:02:13,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. 
Duration: 0.52 2023-09-29 02:02:13,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 02:02:15,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:02:15,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:02:17,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:02:19,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:02:19,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:19,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:21,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:22,587 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 02:02:24,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:27,503 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=213546.66666666666, ans=0.125 2023-09-29 02:02:30,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 02:02:33,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:02:37,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 02:02:39,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:02:41,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:41,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:02:42,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:02:45,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:45,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:02:46,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:48,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-09-29 02:02:48,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_ff2.min_abs, batch_count=213680.0, ans=0.1 2023-09-29 02:02:51,333 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.019e+02 2.375e+02 2.708e+02 4.033e+02, threshold=4.751e+02, percent-clipped=0.0 2023-09-29 02:02:55,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:55,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:02:55,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:02:55,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:02:58,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:59,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 02:03:01,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:03:03,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:03:04,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 02:03:06,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:03:06,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 02:03:06,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:03:06,114 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 02:03:09,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=213746.66666666666, ans=0.2 2023-09-29 02:03:12,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:15,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:03:17,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:03:19,249 INFO [train.py:1039] (0/4) Epoch 7, batch 200, loss[loss=0.2212, simple_loss=0.278, pruned_loss=0.0822, over 23545.00 frames. ], tot_loss[loss=0.235, simple_loss=0.2984, pruned_loss=0.08583, over 3002077.43 frames. ], batch size: 134, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:03:20,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 02:03:22,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 02:03:22,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=213813.33333333334, ans=0.125 2023-09-29 02:03:23,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:26,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 02:03:27,451 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.15 vs. limit=22.5 2023-09-29 02:03:28,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:03:30,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:32,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:03:34,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:03:34,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:34,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:53,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 02:03:53,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:03:55,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:03:55,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:03:57,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:03:57,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:03:58,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:00,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:04:02,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:02,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:03,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 02:04:03,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:04:05,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:04:07,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:04:09,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=214013.33333333334, ans=0.5 2023-09-29 02:04:15,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:17,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=214013.33333333334, ans=0.1 2023-09-29 02:04:20,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=214013.33333333334, ans=0.125 2023-09-29 02:04:22,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:23,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:04:25,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=214080.0, ans=0.0 2023-09-29 02:04:30,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:33,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 02:04:34,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:04:34,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:04:35,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:37,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:04:37,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 02:04:37,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:04:38,700 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 02:04:40,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:42,341 INFO [train.py:1039] (0/4) Epoch 7, batch 250, loss[loss=0.2322, simple_loss=0.3011, pruned_loss=0.08163, over 24315.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.2979, pruned_loss=0.08512, over 3390373.33 frames. ], batch size: 61, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:04:42,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:04:43,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:44,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:04:45,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:04:45,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:48,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:04:51,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:00,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=214213.33333333334, ans=0.1 2023-09-29 02:05:03,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:06,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:06,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:05:15,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:05:16,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:05:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 02:05:18,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:19,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:05:19,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:05:20,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:23,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:05:24,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 02:05:24,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:27,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:05:27,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:05:27,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:05:29,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:05:29,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 02:05:29,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:05:32,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:34,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:05:34,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:35,828 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.070e+02 2.355e+02 2.714e+02 5.110e+02, threshold=4.709e+02, percent-clipped=2.0 2023-09-29 02:05:36,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=214346.66666666666, ans=0.2 2023-09-29 02:05:39,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:05:46,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:46,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=214413.33333333334, ans=0.125 2023-09-29 02:05:49,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:54,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 02:05:55,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:05:58,619 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.03 vs. limit=15.0 2023-09-29 02:05:59,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 02:05:59,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:59,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:06:02,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 02:06:02,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:06:03,347 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.61 vs. limit=6.0 2023-09-29 02:06:04,123 INFO [train.py:1039] (0/4) Epoch 7, batch 300, loss[loss=0.2081, simple_loss=0.2478, pruned_loss=0.08418, over 19493.00 frames. ], tot_loss[loss=0.2312, simple_loss=0.2941, pruned_loss=0.08414, over 3665416.57 frames. ], batch size: 390, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:06:04,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:06:04,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-09-29 02:06:07,214 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.27 vs. limit=22.5 2023-09-29 02:06:08,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=214480.0, ans=0.1 2023-09-29 02:06:09,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:10,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:12,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:06:14,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 02:06:16,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:06:18,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:06:18,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 02:06:18,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:06:21,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=214546.66666666666, ans=0.125 2023-09-29 02:06:23,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:06:23,862 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.71 vs. limit=6.0 2023-09-29 02:06:26,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:06:27,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 02:06:29,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=214546.66666666666, ans=0.125 2023-09-29 02:06:29,429 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.06 vs. limit=6.0 2023-09-29 02:06:30,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 02:06:30,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:33,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:34,242 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.02 vs. 
limit=22.5 2023-09-29 02:06:34,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:34,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 02:06:34,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:06:39,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:06:41,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:06:41,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:06:46,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:06:46,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 02:06:46,461 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=214613.33333333334, ans=0.0 2023-09-29 02:06:48,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:06:50,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:52,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. 
Duration: 0.55 2023-09-29 02:06:53,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:57,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:06:58,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:58,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 02:07:04,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:04,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:07:05,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=214680.0, ans=0.0 2023-09-29 02:07:07,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:10,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:07:10,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 02:07:10,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:07:11,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:07:11,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 02:07:15,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:17,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:18,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:18,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:18,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:25,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:25,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 02:07:27,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:29,104 INFO [train.py:1039] (0/4) Epoch 7, batch 350, loss[loss=0.2289, simple_loss=0.2842, pruned_loss=0.08682, over 23818.00 frames. ], tot_loss[loss=0.23, simple_loss=0.2922, pruned_loss=0.08393, over 3886708.53 frames. ], batch size: 212, lr: 1.52e-02, grad_scale: 16.0 2023-09-29 02:07:34,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:07:39,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:39,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:40,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 02:07:43,106 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.79 vs. limit=15.0 2023-09-29 02:07:43,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:43,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 02:07:45,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:45,735 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:07:46,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 02:07:47,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:50,041 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=12.0 2023-09-29 02:07:50,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-09-29 02:07:52,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:07:54,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:57,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:07:58,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:07:58,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:58,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:08:02,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:02,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:10,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:10,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 02:08:12,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:08:12,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:17,616 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.18 vs. limit=15.0 2023-09-29 02:08:18,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 02:08:18,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:24,063 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.070e+02 2.347e+02 2.700e+02 4.079e+02, threshold=4.694e+02, percent-clipped=0.0 2023-09-29 02:08:24,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:24,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:24,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:08:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 02:08:29,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:30,840 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. 
Duration: 0.976 2023-09-29 02:08:32,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 02:08:32,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:35,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:08:35,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 02:08:38,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:41,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:08:41,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:42,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:42,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:45,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:48,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:51,119 INFO [train.py:1039] (0/4) Epoch 7, batch 400, loss[loss=0.2451, simple_loss=0.3162, pruned_loss=0.08703, over 24445.00 frames. 
], tot_loss[loss=0.2295, simple_loss=0.2924, pruned_loss=0.08333, over 4075895.15 frames. ], batch size: 69, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:08:51,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:08:51,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 02:08:51,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:52,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:54,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:08:54,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:08:57,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:58,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:01,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 02:09:02,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 02:09:02,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:09:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 02:09:04,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:10,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:09:10,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:10,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 02:09:10,567 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.53 vs. limit=12.0 2023-09-29 02:09:11,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:09:12,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:12,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:13,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:09:16,125 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 02:09:16,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. 
Duration: 0.94 2023-09-29 02:09:16,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215213.33333333334, ans=0.1 2023-09-29 02:09:21,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:22,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:24,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 02:09:24,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=215280.0, ans=0.125 2023-09-29 02:09:25,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 02:09:28,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:09:31,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:38,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 02:09:38,958 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=215346.66666666666, ans=0.09899494936611666 2023-09-29 02:09:41,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:09:41,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. 
Duration: 0.87 2023-09-29 02:09:45,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:47,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:09:47,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 02:09:48,386 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.46 vs. limit=10.0 2023-09-29 02:09:50,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:09:51,378 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:09:53,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:09:53,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215346.66666666666, ans=0.1 2023-09-29 02:09:54,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:56,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:56,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 02:09:59,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 02:09:59,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 02:10:02,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:10:02,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:10:04,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 02:10:07,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:10:07,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:10:07,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:10:09,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 02:10:09,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:10:11,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:10:11,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:10:11,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. 
Duration: 0.99 2023-09-29 02:10:12,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:10:12,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=215480.0, ans=0.0 2023-09-29 02:10:14,720 INFO [train.py:1039] (0/4) Epoch 7, batch 450, loss[loss=0.2495, simple_loss=0.3213, pruned_loss=0.08887, over 24130.00 frames. ], tot_loss[loss=0.2316, simple_loss=0.2942, pruned_loss=0.08447, over 4206446.96 frames. ], batch size: 80, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:10:14,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:10:18,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:10:20,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=215480.0, ans=0.125 2023-09-29 02:10:26,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=215480.0, ans=0.0 2023-09-29 02:10:29,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:29,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:10:29,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=215546.66666666666, ans=0.5 2023-09-29 02:10:32,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. 
Duration: 0.65 2023-09-29 02:10:32,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 02:10:37,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:10:38,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:40,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:44,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:45,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:47,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=215613.33333333334, ans=0.125 2023-09-29 02:10:48,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 02:10:48,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 02:10:48,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=215613.33333333334, ans=0.07 2023-09-29 02:10:50,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 02:10:52,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 02:10:52,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:52,468 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=215613.33333333334, ans=0.0 2023-09-29 02:10:54,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:10:57,291 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 02:10:57,315 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 02:10:57,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:58,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:11:00,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:11:00,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=215613.33333333334, ans=0.1 2023-09-29 02:11:04,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:11:04,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:11:04,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-09-29 02:11:04,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=215680.0, ans=0.0 2023-09-29 02:11:05,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 02:11:08,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:10,021 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.891e+02 2.155e+02 2.361e+02 4.169e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 02:11:10,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:11:10,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:11:11,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 02:11:16,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:11:18,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 02:11:18,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 02:11:20,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 02:11:23,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=215746.66666666666, ans=0.125 2023-09-29 02:11:26,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:11:26,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:30,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:11:30,286 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 02:11:33,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:34,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:11:35,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:35,060 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 02:11:36,812 INFO [train.py:1039] (0/4) Epoch 7, batch 500, loss[loss=0.2324, simple_loss=0.2991, pruned_loss=0.0828, over 24641.00 frames. ], tot_loss[loss=0.2322, simple_loss=0.295, pruned_loss=0.08475, over 4316334.82 frames. ], batch size: 65, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:11:37,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-09-29 02:11:37,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:41,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:11:44,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 02:11:46,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:11:47,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:48,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:48,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215813.33333333334, ans=0.1 2023-09-29 02:11:49,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:00,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=215880.0, ans=0.125 2023-09-29 02:12:01,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:01,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:12:03,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 02:12:03,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:03,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 02:12:05,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:12:07,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.01 vs. limit=22.5 2023-09-29 02:12:08,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:12:08,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=215946.66666666666, ans=10.0 2023-09-29 02:12:09,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:12:11,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:12:11,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:13,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 02:12:14,890 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. 
Duration: 0.64 2023-09-29 02:12:15,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=215946.66666666666, ans=0.125 2023-09-29 02:12:15,582 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.29 vs. limit=22.5 2023-09-29 02:12:17,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:19,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:22,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:12:24,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 02:12:28,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:12:29,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:34,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 02:12:38,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:45,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:50,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 02:12:50,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:50,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:51,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 02:12:53,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:12:53,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:59,425 INFO [train.py:1039] (0/4) Epoch 7, batch 550, loss[loss=0.2273, simple_loss=0.2902, pruned_loss=0.0822, over 23652.00 frames. ], tot_loss[loss=0.2324, simple_loss=0.2957, pruned_loss=0.08452, over 4416489.83 frames. ], batch size: 232, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:12:59,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 02:13:02,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. 
Duration: 0.98 2023-09-29 02:13:02,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:02,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 02:13:02,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:13:02,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:03,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=216146.66666666666, ans=0.5 2023-09-29 02:13:04,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:13:06,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:13:08,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:13:09,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 02:13:09,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 02:13:13,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=216146.66666666666, ans=0.125 2023-09-29 02:13:17,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:18,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:19,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:21,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:25,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=216213.33333333334, ans=0.0 2023-09-29 02:13:26,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=216213.33333333334, ans=0.125 2023-09-29 02:13:27,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 02:13:29,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 02:13:30,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:13:35,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:13:35,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 02:13:36,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:13:40,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:40,125 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 02:13:42,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:43,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:13:45,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:47,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:13:47,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:13:49,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:49,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 02:13:51,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. 
Duration: 0.79 2023-09-29 02:13:51,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=216346.66666666666, ans=0.125 2023-09-29 02:13:52,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:13:52,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:54,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:13:54,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:55,676 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.132e+02 2.384e+02 2.779e+02 4.607e+02, threshold=4.767e+02, percent-clipped=1.0 2023-09-29 02:13:56,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:13:59,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:14:01,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:14:02,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:02,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-29 02:14:04,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:14:05,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:05,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:14:07,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:08,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:14:09,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:14:10,730 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=216413.33333333334, ans=0.0 2023-09-29 02:14:13,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 02:14:19,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 02:14:22,411 INFO [train.py:1039] (0/4) Epoch 7, batch 600, loss[loss=0.2382, simple_loss=0.2987, pruned_loss=0.08881, over 23702.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.2958, pruned_loss=0.08506, over 4472385.75 frames. ], batch size: 135, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:14:22,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:14:22,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:14:22,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:28,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=216480.0, ans=0.0 2023-09-29 02:14:30,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:14:34,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:14:34,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 02:14:37,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:14:39,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:14:42,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:43,975 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=216546.66666666666, ans=0.125 2023-09-29 02:14:45,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 02:14:45,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:14:45,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=216546.66666666666, ans=0.125 2023-09-29 02:14:47,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=216546.66666666666, ans=0.0 2023-09-29 02:14:50,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 02:14:55,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:14:55,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:56,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:15:01,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:15:01,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:15:03,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:10,980 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.46 vs. limit=22.5 2023-09-29 02:15:12,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 02:15:14,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:16,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:15:16,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:15:22,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 02:15:27,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:15:27,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:15:33,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 02:15:33,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:15:37,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 02:15:38,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:15:39,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:15:44,075 INFO [train.py:1039] (0/4) Epoch 7, batch 650, loss[loss=0.2307, simple_loss=0.3055, pruned_loss=0.07796, over 24664.00 frames. 
], tot_loss[loss=0.2319, simple_loss=0.2949, pruned_loss=0.08446, over 4529799.07 frames. ], batch size: 73, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:15:45,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:15:47,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:15:48,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:15:51,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:15:51,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:15:54,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 02:15:56,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:16:01,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:16:01,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:04,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:07,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-09-29 02:16:11,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:11,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:16,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:16,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:16:18,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:18,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=216946.66666666666, ans=0.07 2023-09-29 02:16:19,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:21,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:16:21,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:22,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:16:24,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:16:25,736 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-09-29 02:16:25,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:25,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:28,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:30,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:30,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:31,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:16:34,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 02:16:34,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:16:34,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:16:36,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:16:37,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 02:16:38,836 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.233e+02 2.449e+02 2.795e+02 3.907e+02, threshold=4.898e+02, percent-clipped=0.0 2023-09-29 02:16:38,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:16:40,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 02:16:40,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 02:16:42,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:42,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:42,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:16:42,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:44,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:51,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:51,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:52,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:16:54,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:54,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:16:56,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:17:00,737 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=217080.0, ans=0.125 2023-09-29 02:17:02,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:17:02,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:03,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:03,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:05,227 INFO [train.py:1039] (0/4) Epoch 7, batch 700, loss[loss=0.2413, simple_loss=0.2986, pruned_loss=0.092, over 23629.00 frames. ], tot_loss[loss=0.2307, simple_loss=0.2926, pruned_loss=0.0844, over 4552572.42 frames. ], batch size: 149, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:17:10,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 02:17:10,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. 
Duration: 0.97 2023-09-29 02:17:14,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 02:17:15,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:18,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:17:20,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 02:17:24,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:27,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:17:29,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:30,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:17:31,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:17:34,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:37,268 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=22.5 2023-09-29 02:17:37,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-29 02:17:37,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:17:40,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 02:17:45,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 02:17:48,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:17:50,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:17:51,046 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.75 vs. limit=15.0 2023-09-29 02:17:52,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:17:55,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=217346.66666666666, ans=0.125 2023-09-29 02:17:57,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:17:57,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 02:18:01,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:02,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 02:18:02,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 02:18:06,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:18:08,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:11,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:18:14,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:18:14,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 02:18:18,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 02:18:20,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 02:18:23,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:25,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:25,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:18:29,129 INFO [train.py:1039] (0/4) Epoch 7, batch 750, loss[loss=0.2387, simple_loss=0.3161, pruned_loss=0.08071, over 24562.00 frames. 
], tot_loss[loss=0.2304, simple_loss=0.2929, pruned_loss=0.08394, over 4596185.43 frames. ], batch size: 71, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:18:29,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:29,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 02:18:33,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 02:18:33,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 02:18:33,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 02:18:35,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 02:18:35,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 02:18:35,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:18:38,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 02:18:38,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:39,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 02:18:41,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:42,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:44,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:18:44,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:48,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:18:50,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:18:52,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:18:54,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:56,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:56,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 02:18:56,566 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:18:57,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 02:18:57,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:00,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:01,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:19:02,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=217613.33333333334, ans=0.0 2023-09-29 02:19:03,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 02:19:03,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:05,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 02:19:05,527 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 02:19:05,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 02:19:05,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:19:07,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:19:07,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 02:19:14,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:19:14,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:14,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:19:17,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:19:19,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:19,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 02:19:21,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:19:23,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 02:19:23,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:19:24,700 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.977e+02 2.216e+02 2.498e+02 3.508e+02, threshold=4.433e+02, percent-clipped=0.0 2023-09-29 02:19:24,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:19:26,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. 
Duration: 0.92 2023-09-29 02:19:28,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:32,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:19:33,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:19:33,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:19:35,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:19:40,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 02:19:41,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:43,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:45,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:46,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:48,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:48,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 02:19:51,137 INFO [train.py:1039] (0/4) Epoch 7, batch 800, loss[loss=0.2164, simple_loss=0.2696, pruned_loss=0.0816, over 23533.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.293, pruned_loss=0.0837, over 4621802.97 frames. ], batch size: 134, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:19:51,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=217813.33333333334, ans=0.125 2023-09-29 02:19:56,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=217813.33333333334, ans=0.125 2023-09-29 02:19:57,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:57,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:59,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:59,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:00,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:00,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:04,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 02:20:06,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=217880.0, ans=0.0 2023-09-29 02:20:08,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:09,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:20:12,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 02:20:12,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:12,971 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=217880.0, ans=0.2 2023-09-29 02:20:14,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:14,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:20:15,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:15,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 02:20:15,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:17,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. 
Duration: 0.39 2023-09-29 02:20:19,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:21,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:23,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:20:23,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:25,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:25,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:28,481 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=217946.66666666666, ans=0.2 2023-09-29 02:20:30,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:20:31,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:20:31,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 02:20:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 02:20:33,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. 
Duration: 0.57 2023-09-29 02:20:35,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:20:35,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:20:37,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:39,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:20:44,428 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 02:20:44,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 02:20:44,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:20:44,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=218013.33333333334, ans=0.0 2023-09-29 02:20:47,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:20:47,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=218013.33333333334, ans=0.2 2023-09-29 02:20:51,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:20:55,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 02:20:56,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 02:20:56,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:21:01,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 02:21:07,412 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.95 vs. limit=8.0 2023-09-29 02:21:07,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:10,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:21:11,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 02:21:12,919 INFO [train.py:1039] (0/4) Epoch 7, batch 850, loss[loss=0.2251, simple_loss=0.3056, pruned_loss=0.07228, over 24316.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.2931, pruned_loss=0.08316, over 4649314.36 frames. ], batch size: 74, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:21:13,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:21:15,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 02:21:15,386 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=218146.66666666666, ans=0.0 2023-09-29 02:21:16,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 02:21:16,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:18,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:21:18,426 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=218146.66666666666, ans=0.125 2023-09-29 02:21:19,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:19,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:21:20,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=218146.66666666666, ans=0.2 2023-09-29 02:21:21,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:21:22,290 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.02 vs. limit=10.0 2023-09-29 02:21:22,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 02:21:23,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. 
Duration: 0.96 2023-09-29 02:21:23,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 02:21:24,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:26,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:21:29,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:29,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:29,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:21:32,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:33,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:21:33,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 02:21:35,690 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=218213.33333333334, ans=0.0 2023-09-29 02:21:38,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 02:21:40,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:21:42,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 02:21:45,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 02:21:47,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 02:21:51,162 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 02:21:51,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:21:51,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:21:51,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:21:54,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:55,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:57,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 02:21:58,966 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=218280.0, ans=0.125 2023-09-29 02:22:00,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 02:22:00,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:01,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:22:01,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:22:02,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:22:03,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:22:05,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 02:22:09,497 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.250e+02 2.603e+02 3.078e+02 4.971e+02, threshold=5.207e+02, percent-clipped=2.0 2023-09-29 02:22:09,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:22:09,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:11,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:22:11,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:22:12,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:12,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=218346.66666666666, ans=0.02 2023-09-29 02:22:16,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:22:18,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:22:19,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.76 vs. limit=10.0 2023-09-29 02:22:20,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:22:20,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:20,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:22:28,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:22:29,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:29,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. 
Duration: 0.87 2023-09-29 02:22:29,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=218413.33333333334, ans=0.125 2023-09-29 02:22:31,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:31,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:32,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 02:22:35,486 INFO [train.py:1039] (0/4) Epoch 7, batch 900, loss[loss=0.2305, simple_loss=0.2889, pruned_loss=0.08603, over 23143.00 frames. ], tot_loss[loss=0.2312, simple_loss=0.2947, pruned_loss=0.08392, over 4658785.44 frames. ], batch size: 105, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:22:37,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:22:40,814 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=218480.0, ans=15.0 2023-09-29 02:22:41,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:41,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 02:22:43,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=218480.0, ans=0.05 2023-09-29 02:22:44,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 02:22:45,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 02:22:46,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=218480.0, ans=0.125 2023-09-29 02:22:48,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:22:50,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:50,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:22:50,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:22:51,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:22:55,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=218546.66666666666, ans=0.0 2023-09-29 02:23:01,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:01,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:23:01,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:23:04,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:23:10,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 02:23:12,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:23:16,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:23:18,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:23:18,927 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 02:23:19,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 02:23:26,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:23:26,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:23:28,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:23:35,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:35,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:23:37,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-09-29 02:23:37,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:40,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 02:23:43,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:23:43,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:45,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:23:45,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:23:48,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 02:23:50,121 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 02:23:51,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:23:51,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 02:23:51,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=218746.66666666666, ans=0.0 2023-09-29 02:23:53,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 02:23:56,775 INFO [train.py:1039] (0/4) Epoch 7, batch 950, loss[loss=0.2362, simple_loss=0.2802, pruned_loss=0.0961, over 19789.00 frames. ], tot_loss[loss=0.231, simple_loss=0.2947, pruned_loss=0.08367, over 4676993.48 frames. ], batch size: 388, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:23:58,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 02:24:03,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:05,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:05,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:05,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=16.56 vs. limit=15.0 2023-09-29 02:24:06,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:24:08,423 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 02:24:12,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:13,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:14,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:24:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:24:15,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 02:24:16,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:24:18,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:19,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 02:24:19,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:24,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:24:25,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 02:24:27,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 02:24:31,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 02:24:32,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:24:32,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=218946.66666666666, ans=0.125 2023-09-29 02:24:39,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:24:39,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:41,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 02:24:43,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:24:43,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:24:44,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:44,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:44,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:24:50,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 02:24:50,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 02:24:53,268 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.008e+02 2.208e+02 2.603e+02 6.954e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 02:24:54,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:54,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 02:24:54,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:54,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:24:56,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 02:24:59,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:25:01,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:25:08,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:08,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 02:25:09,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-09-29 02:25:13,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:25:14,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=219080.0, ans=0.125 2023-09-29 02:25:17,793 INFO [train.py:1039] (0/4) Epoch 7, batch 1000, loss[loss=0.2212, simple_loss=0.2985, pruned_loss=0.07197, over 24633.00 frames. ], tot_loss[loss=0.2301, simple_loss=0.293, pruned_loss=0.08356, over 4677732.31 frames. ], batch size: 68, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:25:19,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 02:25:19,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:22,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:25:23,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=219146.66666666666, ans=0.1 2023-09-29 02:25:25,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 02:25:25,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 02:25:30,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:30,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 02:25:32,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:34,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 02:25:39,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 02:25:41,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 02:25:43,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:25:44,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 02:25:46,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 02:25:46,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 02:25:48,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:49,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:51,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=219280.0, ans=0.2 2023-09-29 02:25:57,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 02:25:58,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:25:59,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:59,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:59,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 02:25:59,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:01,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:26:01,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:26:02,554 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 02:26:06,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 02:26:06,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 02:26:07,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 02:26:11,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 02:26:18,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:18,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:26:20,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:21,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:26:21,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 02:26:24,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:26:25,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 02:26:25,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 02:26:27,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:26:27,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:26:27,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=219413.33333333334, ans=0.125 2023-09-29 02:26:29,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 02:26:31,944 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.26 vs. limit=15.0 2023-09-29 02:26:34,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:26:37,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:40,868 INFO [train.py:1039] (0/4) Epoch 7, batch 1050, loss[loss=0.1945, simple_loss=0.2615, pruned_loss=0.06371, over 24436.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2915, pruned_loss=0.08227, over 4704676.31 frames. ], batch size: 58, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:26:40,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:26:41,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:26:42,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:26:43,354 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.21 vs. limit=22.5 2023-09-29 02:26:44,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:46,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 02:26:49,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:26:50,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:26:53,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:26:54,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:26:54,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:26:56,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:26:56,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 02:26:56,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=219546.66666666666, ans=0.1 2023-09-29 02:26:57,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:26:57,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 02:27:00,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:27:00,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-29 02:27:02,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:27:06,257 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=219546.66666666666, ans=0.125 2023-09-29 02:27:07,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=219546.66666666666, ans=0.0 2023-09-29 02:27:09,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:27:09,815 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.70 vs. limit=15.0 2023-09-29 02:27:10,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:27:10,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:27:14,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 02:27:14,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 02:27:15,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:27:19,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 02:27:21,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. 
Duration: 0.73 2023-09-29 02:27:22,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:22,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=219613.33333333334, ans=0.125 2023-09-29 02:27:26,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:27:28,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:27:29,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:27:31,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:27:33,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=219680.0, ans=0.125 2023-09-29 02:27:34,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:27:37,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 02:27:39,289 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.034e+02 2.317e+02 2.689e+02 3.658e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 02:27:39,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 02:27:39,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-29 02:27:39,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:40,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:27:42,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 02:27:47,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:27:49,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:49,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:27:49,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:49,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:55,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:55,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 02:27:56,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:56,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. 
Duration: 0.92 2023-09-29 02:27:56,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 02:27:57,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=219746.66666666666, ans=0.04949747468305833 2023-09-29 02:27:58,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:28:00,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:01,113 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=219746.66666666666, ans=0.125 2023-09-29 02:28:02,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=219813.33333333334, ans=0.0 2023-09-29 02:28:03,721 INFO [train.py:1039] (0/4) Epoch 7, batch 1100, loss[loss=0.2693, simple_loss=0.3, pruned_loss=0.1194, over 19433.00 frames. ], tot_loss[loss=0.2277, simple_loss=0.2916, pruned_loss=0.08185, over 4709666.35 frames. ], batch size: 388, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:28:05,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:28:07,640 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-09-29 02:28:10,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:28:10,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 02:28:12,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:12,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 02:28:15,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:28:18,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:28:18,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=219880.0, ans=0.04949747468305833 2023-09-29 02:28:20,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:28:22,155 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:28:25,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:28:25,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 02:28:25,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:28:27,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:27,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 02:28:32,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:28:33,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:28:37,760 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.93 vs. limit=22.5 2023-09-29 02:28:39,027 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.86 vs. limit=12.0 2023-09-29 02:28:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:28:43,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 02:28:43,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=219946.66666666666, ans=0.05 2023-09-29 02:28:44,494 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 02:28:44,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:47,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:49,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:28:49,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:28:51,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 02:28:52,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:28:52,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:28:52,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:28:54,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:54,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=220013.33333333334, ans=0.125 2023-09-29 02:28:55,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 02:28:56,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=220013.33333333334, ans=0.04949747468305833 2023-09-29 02:29:00,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:29:00,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=220013.33333333334, ans=0.125 2023-09-29 02:29:01,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. 
Duration: 0.91 2023-09-29 02:29:03,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:29:03,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=220013.33333333334, ans=0.2 2023-09-29 02:29:06,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:29:10,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 02:29:10,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:29:11,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:13,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:14,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:16,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 02:29:16,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:29:16,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:18,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-09-29 02:29:18,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:29:19,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 02:29:21,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:29:21,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:29:22,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:29:26,263 INFO [train.py:1039] (0/4) Epoch 7, batch 1150, loss[loss=0.2438, simple_loss=0.3022, pruned_loss=0.09269, over 23627.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2923, pruned_loss=0.08187, over 4706846.68 frames. ], batch size: 149, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:29:28,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:30,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:29:33,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:34,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:29:35,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. 
Duration: 0.95 2023-09-29 02:29:35,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:38,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 02:29:38,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:38,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:29:45,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 02:29:48,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:51,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:52,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:29:52,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 02:29:53,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:29:53,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:59,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. 
Duration: 0.42 2023-09-29 02:29:59,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:30:01,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:30:11,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=220280.0, ans=0.0 2023-09-29 02:30:13,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:18,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=220346.66666666666, ans=0.0 2023-09-29 02:30:21,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:21,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 02:30:22,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:24,042 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 2.131e+02 2.399e+02 2.911e+02 4.367e+02, threshold=4.797e+02, percent-clipped=0.0 2023-09-29 02:30:24,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:26,279 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.20 vs. 
limit=15.0 2023-09-29 02:30:28,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=220346.66666666666, ans=0.1 2023-09-29 02:30:30,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 02:30:30,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:32,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=220413.33333333334, ans=0.2 2023-09-29 02:30:37,234 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 02:30:42,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:43,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:30:43,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:30:43,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:30:46,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:30:48,884 INFO [train.py:1039] (0/4) Epoch 7, batch 1200, loss[loss=0.2206, simple_loss=0.2918, pruned_loss=0.07473, over 24479.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2932, pruned_loss=0.0826, over 4709461.58 frames. 
], batch size: 66, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:30:50,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:30:50,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:30:54,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:30:54,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:55,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:30:57,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:30:58,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:31:00,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:00,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:03,471 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 02:31:06,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 02:31:10,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 02:31:13,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:31:16,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:18,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:31:18,893 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 02:31:19,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:27,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:31:27,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:31:27,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 02:31:29,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:31:29,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=220613.33333333334, ans=0.125 2023-09-29 02:31:32,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. 
Duration: 0.72 2023-09-29 02:31:34,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=220613.33333333334, ans=0.0 2023-09-29 02:31:38,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 02:31:38,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:38,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:39,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:40,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:31:41,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:41,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:31:41,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:31:43,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 02:31:43,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:31:43,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 02:31:44,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:31:49,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:49,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:54,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:31:57,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:32:00,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 02:32:03,886 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 02:32:05,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:08,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:32:10,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:32:11,587 INFO [train.py:1039] (0/4) Epoch 7, batch 1250, loss[loss=0.1974, simple_loss=0.2739, pruned_loss=0.0605, over 24301.00 frames. ], tot_loss[loss=0.2286, simple_loss=0.2928, pruned_loss=0.08217, over 4705222.29 frames. 
], batch size: 61, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:32:11,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:32:14,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 02:32:17,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:32:19,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:19,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 02:32:22,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:32:24,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:32:27,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:32:28,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:29,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:32:29,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 02:32:31,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=220880.0, ans=0.125 2023-09-29 02:32:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:32:35,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 02:32:36,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:32:36,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:36,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:38,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:41,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:32:41,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:32:46,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 02:32:46,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:32:49,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:32:50,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 02:32:52,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:52,876 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 02:32:52,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:54,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:57,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:33:04,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 02:33:05,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 02:33:05,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. 
Duration: 0.78 2023-09-29 02:33:08,746 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.040e+02 2.260e+02 2.662e+02 4.055e+02, threshold=4.521e+02, percent-clipped=0.0 2023-09-29 02:33:10,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:12,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 02:33:12,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:15,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:33:15,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:33:18,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 02:33:18,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:33:20,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:33:20,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:33:20,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:33:22,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. 
Duration: 0.85 2023-09-29 02:33:25,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:26,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:33:27,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:33:28,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=221080.0, ans=0.0 2023-09-29 02:33:30,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:33:32,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:32,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 02:33:34,492 INFO [train.py:1039] (0/4) Epoch 7, batch 1300, loss[loss=0.2278, simple_loss=0.2798, pruned_loss=0.08796, over 23814.00 frames. ], tot_loss[loss=0.2299, simple_loss=0.2939, pruned_loss=0.08301, over 4708135.65 frames. ], batch size: 164, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:33:39,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:40,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:33:42,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 02:33:43,083 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=12.0 2023-09-29 02:33:43,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:45,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:33:47,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 02:33:50,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:33:52,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:33:53,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 02:33:58,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:34:00,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=221213.33333333334, ans=0.5 2023-09-29 02:34:03,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:05,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:34:06,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:34:07,845 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.96 vs. limit=15.0 2023-09-29 02:34:08,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:10,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:34:10,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:34:10,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 02:34:19,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:34:19,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:34:19,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 02:34:20,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:34:22,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 02:34:24,626 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=221346.66666666666, ans=0.125 2023-09-29 02:34:25,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:34:27,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 02:34:27,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:27,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 02:34:29,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:33,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:33,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:34:36,111 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.27 vs. limit=10.0 2023-09-29 02:34:36,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 02:34:39,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 02:34:40,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. 
Duration: 0.66 2023-09-29 02:34:45,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:34:48,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=221413.33333333334, ans=0.125 2023-09-29 02:34:49,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 02:34:50,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:57,754 INFO [train.py:1039] (0/4) Epoch 7, batch 1350, loss[loss=0.2261, simple_loss=0.28, pruned_loss=0.08604, over 23734.00 frames. ], tot_loss[loss=0.229, simple_loss=0.2933, pruned_loss=0.08237, over 4704115.77 frames. ], batch size: 212, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:34:58,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 02:35:02,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:04,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:07,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:35:08,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:10,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 02:35:10,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:15,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:17,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 02:35:18,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:20,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:35:22,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 02:35:24,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:35:25,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:35:25,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 02:35:27,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 02:35:29,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 02:35:31,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:35:31,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 02:35:36,908 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.53 vs. limit=15.0 2023-09-29 02:35:42,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:46,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=221680.0, ans=0.1 2023-09-29 02:35:51,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:51,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:51,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=221680.0, ans=0.0 2023-09-29 02:35:52,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 02:35:55,566 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.214e+02 2.683e+02 3.104e+02 4.290e+02, threshold=5.366e+02, percent-clipped=0.0 2023-09-29 02:35:55,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:57,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 02:35:57,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 02:35:59,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:36:01,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:36:03,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 02:36:04,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:36:08,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 02:36:09,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 02:36:17,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 02:36:19,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:36:20,678 INFO [train.py:1039] (0/4) Epoch 7, batch 1400, loss[loss=0.1985, simple_loss=0.2641, pruned_loss=0.06645, over 24325.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2918, pruned_loss=0.08211, over 4708346.30 frames. ], batch size: 56, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:36:23,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:36:23,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 02:36:26,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=221813.33333333334, ans=0.1 2023-09-29 02:36:27,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 02:36:31,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 02:36:35,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=221813.33333333334, ans=0.0 2023-09-29 02:36:37,505 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.84 vs. limit=15.0 2023-09-29 02:36:39,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:36:41,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:36:44,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:36:44,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:36:49,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:36:50,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 02:36:54,095 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=221946.66666666666, ans=0.05 2023-09-29 02:37:00,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:02,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:07,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 02:37:09,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:37:09,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:37:09,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:37:09,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=222013.33333333334, ans=0.125 2023-09-29 02:37:11,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:12,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:37:12,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:37:14,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:37:15,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 02:37:17,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:37:18,781 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:37:20,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:23,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:37:31,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=222080.0, ans=0.0 2023-09-29 02:37:32,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 02:37:33,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:37:35,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:37:36,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:37:38,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 02:37:38,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:37:43,600 INFO [train.py:1039] (0/4) Epoch 7, batch 1450, loss[loss=0.2061, simple_loss=0.2864, pruned_loss=0.06288, over 24492.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.2905, pruned_loss=0.08148, over 4701593.49 frames. ], batch size: 66, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:37:43,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:37:45,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=222146.66666666666, ans=0.09899494936611666 2023-09-29 02:37:46,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:37:46,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:46,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:37:51,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:53,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:37:54,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:54,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-09-29 02:37:56,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:37:56,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 02:37:58,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:59,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:37:59,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 02:37:59,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:37:59,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:38:01,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 02:38:01,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:01,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:38:04,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:08,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 02:38:11,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:38:11,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:38:14,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:38:15,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:18,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:18,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:38:18,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:20,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:23,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 02:38:26,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:38:29,591 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 02:38:32,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 02:38:32,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:38:34,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:36,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 02:38:36,705 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:38:39,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:41,373 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.076e+02 2.226e+02 2.557e+02 3.542e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 02:38:41,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 02:38:43,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 02:38:44,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:44,884 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=222346.66666666666, ans=0.0 2023-09-29 02:38:47,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 02:38:47,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=222413.33333333334, ans=0.125 2023-09-29 02:38:49,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:49,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=222413.33333333334, ans=0.125 2023-09-29 02:38:51,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 02:38:53,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 02:38:53,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 02:38:55,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:56,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:39:06,104 INFO [train.py:1039] (0/4) Epoch 7, batch 1500, loss[loss=0.2298, simple_loss=0.3065, pruned_loss=0.07656, over 24456.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2915, pruned_loss=0.0812, over 4717293.19 frames. ], batch size: 69, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:39:09,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 02:39:09,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 02:39:09,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:39:11,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:11,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:13,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:39:13,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 02:39:16,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:39:16,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:39:16,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:18,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:39:19,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:39:21,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:39:22,397 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=15.24 vs. limit=22.5 2023-09-29 02:39:25,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:27,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 02:39:27,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:39:27,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:39:29,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:32,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 02:39:35,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 02:39:38,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:38,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 02:39:41,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:39:43,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 02:39:45,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:45,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:39:45,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 02:39:46,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:39:46,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:48,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 02:39:50,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:55,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=222680.0, ans=0.1 2023-09-29 02:39:56,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:39:56,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 02:40:02,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=222680.0, ans=0.1 2023-09-29 02:40:03,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 02:40:05,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:40:09,833 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 02:40:09,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:09,926 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 02:40:11,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:13,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:13,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 02:40:14,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:40:19,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 02:40:21,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:24,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:24,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 02:40:24,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:25,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:25,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:40:27,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 02:40:28,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 02:40:29,317 INFO [train.py:1039] (0/4) Epoch 7, batch 1550, loss[loss=0.3285, simple_loss=0.3573, pruned_loss=0.1498, over 19239.00 frames. ], tot_loss[loss=0.2294, simple_loss=0.2935, pruned_loss=0.08262, over 4713536.84 frames. ], batch size: 388, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:40:29,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:40:30,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 02:40:31,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 02:40:31,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=222813.33333333334, ans=0.1 2023-09-29 02:40:32,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 02:40:36,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:36,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:40:36,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:40:37,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:39,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:40,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 02:40:42,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:42,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:40:43,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:40:45,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:40:45,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. 
Duration: 0.71 2023-09-29 02:40:45,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=222880.0, ans=0.125 2023-09-29 02:40:47,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:48,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 02:40:48,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 02:40:48,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 02:40:50,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:51,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:40:56,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:59,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 02:40:59,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-09-29 02:41:00,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=222946.66666666666, ans=0.125 2023-09-29 02:41:03,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=222946.66666666666, ans=0.2 2023-09-29 02:41:07,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:13,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:41:13,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:41:13,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:41:13,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 02:41:19,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:41:22,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:24,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:41:24,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 02:41:24,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=223013.33333333334, ans=0.1 2023-09-29 02:41:25,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:25,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 02:41:25,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:27,783 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.115e+02 2.414e+02 2.837e+02 4.599e+02, threshold=4.828e+02, percent-clipped=1.0 2023-09-29 02:41:29,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:41:29,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:30,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:41:30,893 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 02:41:33,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:41:37,151 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=223080.0, ans=0.125 2023-09-29 02:41:37,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=223080.0, ans=0.0 2023-09-29 02:41:40,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 02:41:45,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:45,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:47,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 02:41:48,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:50,271 INFO [train.py:1039] (0/4) Epoch 7, batch 1600, loss[loss=0.2153, simple_loss=0.2829, pruned_loss=0.07387, over 24342.00 frames. ], tot_loss[loss=0.2301, simple_loss=0.2937, pruned_loss=0.08328, over 4714397.95 frames. ], batch size: 61, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:41:50,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:50,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:41:50,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 02:41:51,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:41:54,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:55,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 02:41:56,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 02:41:58,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 02:41:58,948 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.93 vs. limit=10.0 2023-09-29 02:41:59,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:02,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=223146.66666666666, ans=0.2 2023-09-29 02:42:03,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 02:42:03,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:42:06,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:42:11,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 02:42:15,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 02:42:18,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:42:19,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 02:42:19,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:19,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 02:42:27,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 02:42:32,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=223280.0, ans=10.0 2023-09-29 02:42:33,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:35,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 02:42:35,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:37,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:37,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 02:42:40,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 02:42:43,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 02:42:46,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:42:46,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:46,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:48,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:42:50,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:42:51,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:42:54,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:43:00,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:02,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 02:43:02,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=223413.33333333334, ans=0.0 2023-09-29 02:43:03,872 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=223413.33333333334, ans=0.125 2023-09-29 02:43:05,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 02:43:05,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:43:06,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 02:43:11,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:13,095 INFO [train.py:1039] (0/4) Epoch 7, batch 1650, loss[loss=0.1962, simple_loss=0.2712, pruned_loss=0.06063, over 24504.00 frames. ], tot_loss[loss=0.23, simple_loss=0.2934, pruned_loss=0.08323, over 4712389.77 frames. ], batch size: 63, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:43:14,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:43:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:43:14,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 02:43:14,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-09-29 02:43:14,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 02:43:14,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 02:43:19,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:21,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:21,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=223480.0, ans=0.125 2023-09-29 02:43:22,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:43:22,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:43:26,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:26,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=223480.0, ans=0.125 2023-09-29 02:43:27,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 02:43:30,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:43:30,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 02:43:30,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:43:30,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:43:32,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 02:43:33,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 02:43:39,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:43:41,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:43:48,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 02:43:50,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:43:51,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 02:43:56,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:43:59,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:43:59,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 02:43:59,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:01,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:44:01,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:05,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:05,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:07,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:07,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:08,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:08,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=223680.0, ans=0.125 2023-09-29 02:44:10,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 02:44:13,134 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.058e+02 2.405e+02 2.744e+02 4.179e+02, threshold=4.810e+02, percent-clipped=0.0 2023-09-29 02:44:13,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:13,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 02:44:16,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:16,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=223680.0, ans=0.125 2023-09-29 02:44:18,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 02:44:18,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 02:44:18,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 02:44:19,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:21,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:44:21,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:22,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 02:44:22,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 02:44:24,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=223746.66666666666, ans=0.125 2023-09-29 02:44:26,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:27,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:44:27,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:29,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 02:44:34,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:34,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:44:34,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=223813.33333333334, ans=0.125 2023-09-29 02:44:35,909 INFO [train.py:1039] (0/4) Epoch 7, batch 1700, loss[loss=0.1889, simple_loss=0.2663, pruned_loss=0.05577, over 24284.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.2919, pruned_loss=0.08255, over 4714421.78 frames. ], batch size: 61, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:44:36,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. 
Duration: 0.96 2023-09-29 02:44:37,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:44:37,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:44:37,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:39,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:44:39,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:44:39,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 02:44:42,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:44:51,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:52,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:44:56,081 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:44:57,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=223880.0, ans=0.125 2023-09-29 02:44:58,308 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.19 vs. 
limit=15.0 2023-09-29 02:44:58,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:44:58,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:44:59,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:45:00,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:00,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=223880.0, ans=0.2 2023-09-29 02:45:04,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=223880.0, ans=0.125 2023-09-29 02:45:05,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 02:45:07,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:45:07,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:08,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:45:10,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:45:12,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. 
Duration: 0.79 2023-09-29 02:45:14,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 02:45:15,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:16,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 02:45:17,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:45:21,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=223946.66666666666, ans=0.125 2023-09-29 02:45:27,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:28,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:30,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:45:33,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:45:33,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 02:45:33,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:35,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 02:45:35,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 02:45:36,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:45:36,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:36,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:36,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:45:41,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:41,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:45:41,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:41,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:45:43,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:48,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:50,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. 
Duration: 0.86 2023-09-29 02:45:52,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:52,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:55,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 02:45:58,957 INFO [train.py:1039] (0/4) Epoch 7, batch 1750, loss[loss=0.2059, simple_loss=0.2608, pruned_loss=0.07546, over 23533.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.2904, pruned_loss=0.08199, over 4710988.20 frames. ], batch size: 256, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:46:00,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=224146.66666666666, ans=0.125 2023-09-29 02:46:01,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:03,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:04,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:46:05,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 02:46:06,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:46:09,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 02:46:09,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:14,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 02:46:17,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:19,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 02:46:21,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:21,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:46:21,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=224213.33333333334, ans=0.0 2023-09-29 02:46:25,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:46:26,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 02:46:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:46:28,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 02:46:29,518 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.54 vs. 
limit=10.0 2023-09-29 02:46:36,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:46:37,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=224280.0, ans=0.2 2023-09-29 02:46:41,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:46:41,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:44,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:44,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:46,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:46:48,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:51,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:51,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=224346.66666666666, ans=0.0 2023-09-29 02:46:52,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:52,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-09-29 02:46:55,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:58,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 02:46:58,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:46:59,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=224346.66666666666, ans=0.95 2023-09-29 02:47:00,216 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 2.137e+02 2.415e+02 2.786e+02 3.944e+02, threshold=4.830e+02, percent-clipped=0.0 2023-09-29 02:47:00,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:01,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:47:03,787 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:47:05,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:47:07,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:47:08,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:09,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 02:47:11,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=224413.33333333334, ans=0.07 2023-09-29 02:47:13,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:17,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:17,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:47:19,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 02:47:19,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:21,253 INFO [train.py:1039] (0/4) Epoch 7, batch 1800, loss[loss=0.2195, simple_loss=0.2962, pruned_loss=0.07134, over 24657.00 frames. ], tot_loss[loss=0.2269, simple_loss=0.2901, pruned_loss=0.08183, over 4706200.53 frames. ], batch size: 73, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:47:21,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:47:21,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:21,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:47:21,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 02:47:22,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:47:24,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:47:26,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:27,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:47:29,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:33,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:47:35,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:47:38,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:41,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:41,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:43,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:47:46,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 02:47:46,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 02:47:46,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:46,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=224546.66666666666, ans=0.125 2023-09-29 02:47:49,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:49,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=224546.66666666666, ans=0.0 2023-09-29 02:47:52,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 02:47:54,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 02:47:55,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 02:47:56,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:57,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:57,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:59,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 02:48:06,744 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 02:48:08,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:48:11,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:12,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 02:48:14,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 02:48:14,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:48:16,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:48:16,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:48:22,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 02:48:26,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:48:27,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 02:48:27,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 02:48:27,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:29,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:48:29,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 02:48:32,241 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.71 vs. limit=15.0 2023-09-29 02:48:32,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:48:32,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:48:35,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 02:48:35,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:38,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:38,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:48:38,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 02:48:40,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:48:43,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=224813.33333333334, ans=0.125 2023-09-29 02:48:45,589 INFO [train.py:1039] (0/4) Epoch 7, batch 1850, loss[loss=0.2217, simple_loss=0.2854, pruned_loss=0.07895, over 23637.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2902, pruned_loss=0.08162, over 4714950.20 frames. ], batch size: 149, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:48:45,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:48:45,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:47,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:48:48,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:48:53,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=224813.33333333334, ans=0.0 2023-09-29 02:48:55,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:48:56,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. 
Duration: 0.89 2023-09-29 02:48:59,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 02:49:02,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 02:49:06,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:07,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 02:49:07,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:49:08,228 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.40 vs. limit=15.0 2023-09-29 02:49:19,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:49:21,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 02:49:25,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:49:25,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:49:25,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=224946.66666666666, ans=0.0 2023-09-29 02:49:28,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-09-29 02:49:28,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:29,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:49:31,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:49:33,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:49:36,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:49:40,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:49:41,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:41,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:49:41,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:41,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=225013.33333333334, ans=0.1 2023-09-29 02:49:44,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 02:49:46,005 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.062e+02 2.233e+02 2.588e+02 4.432e+02, threshold=4.466e+02, percent-clipped=0.0 2023-09-29 02:49:46,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:49:51,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 02:49:51,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:55,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:49:55,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=225080.0, ans=0.2 2023-09-29 02:49:57,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:49:57,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 02:49:57,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 02:49:58,659 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 02:50:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 02:50:03,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 02:50:03,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:50:03,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:03,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,122 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 02:50:05,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:50:05,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:50:08,211 INFO [train.py:1039] (0/4) Epoch 7, batch 1900, loss[loss=0.2334, simple_loss=0.2941, pruned_loss=0.08631, over 23799.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.2907, pruned_loss=0.08119, over 4716076.38 frames. ], batch size: 85, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:50:08,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:50:09,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:50:09,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. 
Duration: 0.99 2023-09-29 02:50:13,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:13,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 02:50:13,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:50:14,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:21,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:24,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:50:24,820 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 02:50:26,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 02:50:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:27,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:50:27,948 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 02:50:29,502 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 02:50:33,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-09-29 02:50:36,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:50:39,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 02:50:42,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 02:50:54,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 02:50:55,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 02:50:55,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:57,396 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 02:50:57,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 02:50:57,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 02:50:57,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 02:50:57,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:02,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. 
Duration: 0.97 2023-09-29 02:51:05,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:51:08,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:08,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 02:51:12,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:51:15,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 02:51:17,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:22,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:51:22,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:51:22,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:51:24,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:51:26,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:51:27,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 02:51:27,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:51:30,588 INFO [train.py:1039] (0/4) Epoch 7, batch 1950, loss[loss=0.1968, simple_loss=0.2658, pruned_loss=0.06392, over 24606.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.2917, pruned_loss=0.08147, over 4728424.65 frames. ], batch size: 60, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:51:30,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:30,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:51:34,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:51:34,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:34,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:35,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:40,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:42,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:51:44,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 02:51:44,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:51:45,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 02:51:45,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:51:47,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:47,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:49,721 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.22 vs. limit=15.0 2023-09-29 02:51:50,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:51:50,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:51:50,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:54,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:51:55,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:57,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 02:51:57,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:51:57,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:00,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:04,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=225613.33333333334, ans=0.0 2023-09-29 02:52:05,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:52:05,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:05,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:52:05,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 02:52:07,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:52:07,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:52:07,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:12,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 02:52:13,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:52:18,335 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.31 vs. limit=15.0 2023-09-29 02:52:19,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:52:20,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:52:20,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:52:22,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 02:52:23,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:52:27,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:52:28,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:52:30,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 02:52:32,064 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.356e+02 2.885e+02 3.538e+02 5.665e+02, threshold=5.770e+02, percent-clipped=6.0 2023-09-29 02:52:37,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:38,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:42,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:44,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:47,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:52:47,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:49,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 02:52:49,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:52:49,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:51,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 02:52:53,916 INFO [train.py:1039] (0/4) Epoch 7, batch 2000, loss[loss=0.2057, simple_loss=0.284, pruned_loss=0.06365, over 24558.00 frames. 
], tot_loss[loss=0.228, simple_loss=0.2926, pruned_loss=0.08172, over 4721563.89 frames. ], batch size: 71, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:52:53,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:52:57,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:58,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:52:58,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:01,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:53:03,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:06,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 02:53:08,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:53:08,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=225880.0, ans=0.1 2023-09-29 02:53:10,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:53:13,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. 
Duration: 0.73 2023-09-29 02:53:13,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=225880.0, ans=0.125 2023-09-29 02:53:15,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:53:15,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:53:16,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=225880.0, ans=0.2 2023-09-29 02:53:17,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:53:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 02:53:22,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:25,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 02:53:25,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:53:28,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. 
Duration: 0.93 2023-09-29 02:53:28,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:31,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:53:31,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:53:31,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:33,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:34,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:36,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 02:53:38,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 02:53:38,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:38,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:53:43,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:45,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 02:53:45,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:47,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:49,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:49,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:50,071 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.79 vs. limit=15.0 2023-09-29 02:53:50,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:50,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:51,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=226013.33333333334, ans=0.125 2023-09-29 02:53:52,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:56,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:56,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-09-29 02:53:58,302 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.80 vs. limit=15.0 2023-09-29 02:54:01,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=226080.0, ans=0.125 2023-09-29 02:54:02,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:54:03,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:05,468 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=226080.0, ans=0.0 2023-09-29 02:54:05,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=226080.0, ans=0.125 2023-09-29 02:54:08,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:08,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:54:11,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:14,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:14,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:54:15,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:54:16,493 INFO [train.py:1039] (0/4) Epoch 7, batch 2050, loss[loss=0.2011, simple_loss=0.2653, pruned_loss=0.06843, over 18017.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2919, pruned_loss=0.08198, over 4714991.18 frames. ], batch size: 39, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 02:54:16,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:54:17,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=226146.66666666666, ans=0.2 2023-09-29 02:54:17,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.95 vs. limit=6.0 2023-09-29 02:54:19,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:19,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:19,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=226146.66666666666, ans=0.0 2023-09-29 02:54:23,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:23,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:30,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 02:54:31,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:54:31,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:33,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:54:35,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 02:54:35,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:54:36,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:54:38,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:54:43,115 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=226213.33333333334, ans=0.125 2023-09-29 02:54:47,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:47,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:51,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. 
Duration: 0.72 2023-09-29 02:54:51,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=226280.0, ans=0.1 2023-09-29 02:54:53,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:54,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 02:54:54,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:56,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:00,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:00,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:55:00,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:02,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:55:02,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:55:04,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:55:07,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 02:55:07,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=226346.66666666666, ans=0.125 2023-09-29 02:55:10,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:55:12,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:55:13,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=226346.66666666666, ans=0.035 2023-09-29 02:55:14,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:55:19,291 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.164e+02 2.389e+02 2.987e+02 5.025e+02, threshold=4.777e+02, percent-clipped=0.0 2023-09-29 02:55:19,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:26,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:55:28,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 02:55:33,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:33,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 02:55:33,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=226413.33333333334, ans=0.125 2023-09-29 02:55:36,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:55:38,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 02:55:40,154 INFO [train.py:1039] (0/4) Epoch 7, batch 2100, loss[loss=0.2352, simple_loss=0.2877, pruned_loss=0.09137, over 23778.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2903, pruned_loss=0.08155, over 4721763.95 frames. ], batch size: 164, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:55:41,904 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 02:55:41,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:41,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:42,652 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.53 vs. limit=15.0 2023-09-29 02:55:43,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:55:43,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:43,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. 
Duration: 0.55 2023-09-29 02:55:43,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 02:55:46,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:49,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:55:49,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:55:52,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:55,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:55:55,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 02:55:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:55:56,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 02:55:56,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 02:55:58,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:55:59,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 02:55:59,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 02:56:00,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:56:03,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=226546.66666666666, ans=10.0 2023-09-29 02:56:05,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 02:56:05,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:56:07,663 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-09-29 02:56:08,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:56:08,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:56:08,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=226546.66666666666, ans=0.1 2023-09-29 02:56:13,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:56:13,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. 
Duration: 0.69 2023-09-29 02:56:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:14,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:56:14,976 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=226613.33333333334, ans=0.125 2023-09-29 02:56:15,483 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.60 vs. limit=6.0 2023-09-29 02:56:16,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=226613.33333333334, ans=0.1 2023-09-29 02:56:17,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 02:56:17,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:17,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 02:56:17,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 02:56:19,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 02:56:22,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 02:56:24,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:56:27,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:28,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:31,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:32,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:32,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 02:56:32,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:32,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:34,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:34,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 02:56:36,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 02:56:36,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. 
Duration: 0.8 2023-09-29 02:56:42,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:56:45,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:56:47,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 02:56:52,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:55,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:56:55,386 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=226746.66666666666, ans=0.125 2023-09-29 02:56:55,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=226746.66666666666, ans=0.2 2023-09-29 02:56:56,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:56:56,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:56:56,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:56:56,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 02:56:57,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=226746.66666666666, ans=0.125 2023-09-29 02:56:58,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:58,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:56:59,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:56:59,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:01,342 INFO [train.py:1039] (0/4) Epoch 7, batch 2150, loss[loss=0.2478, simple_loss=0.3027, pruned_loss=0.09648, over 23833.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2892, pruned_loss=0.08077, over 4719764.11 frames. ], batch size: 195, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:57:02,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 02:57:04,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 02:57:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:04,872 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:57:07,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:57:07,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:57:08,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:57:08,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:57:12,330 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.36 vs. limit=22.5 2023-09-29 02:57:12,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:57:15,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:15,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:18,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:57:18,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:18,986 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=226880.0, ans=0.04949747468305833 2023-09-29 02:57:20,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 02:57:23,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:23,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:57:23,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:57:26,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:27,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 02:57:32,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:34,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:57:34,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=226946.66666666666, ans=0.125 2023-09-29 02:57:35,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:35,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:35,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 02:57:36,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:57:37,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:37,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:57:38,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:57:41,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 02:57:43,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:57:44,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:44,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:46,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:57:46,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:57:49,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:49,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 02:57:49,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=227013.33333333334, ans=0.1 2023-09-29 02:57:49,979 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=227013.33333333334, ans=0.125 2023-09-29 02:57:51,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:52,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 02:57:53,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:57:55,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=227013.33333333334, ans=0.04949747468305833 2023-09-29 02:57:56,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:56,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:58,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:58:01,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:58:01,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 02:58:01,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:01,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 02:58:04,434 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.076e+02 2.402e+02 2.714e+02 3.938e+02, threshold=4.805e+02, percent-clipped=0.0 2023-09-29 02:58:04,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 02:58:04,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:58:05,997 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 02:58:06,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:06,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:58:07,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 02:58:07,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:58:07,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 02:58:07,735 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. 
Duration: 0.768 2023-09-29 02:58:07,735 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 02:58:09,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 02:58:10,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:11,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=227080.0, ans=0.1 2023-09-29 02:58:12,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:58:12,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:58:12,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:13,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:58:15,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:15,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 02:58:18,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=227080.0, ans=0.0 2023-09-29 02:58:19,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=227080.0, ans=0.125 2023-09-29 02:58:22,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:58:24,650 INFO [train.py:1039] (0/4) Epoch 7, batch 2200, loss[loss=0.23, simple_loss=0.2959, pruned_loss=0.08212, over 23749.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2896, pruned_loss=0.08076, over 4729382.72 frames. ], batch size: 150, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:58:24,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 02:58:28,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=227146.66666666666, ans=0.0 2023-09-29 02:58:29,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:58:30,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=227146.66666666666, ans=0.125 2023-09-29 02:58:34,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:35,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=227146.66666666666, ans=0.0 2023-09-29 02:58:36,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 02:58:36,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:58:37,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:58:39,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:39,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:58:39,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 02:58:44,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 02:58:44,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=227213.33333333334, ans=0.0 2023-09-29 02:58:45,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:58:52,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=227213.33333333334, ans=0.125 2023-09-29 02:58:53,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 02:58:58,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:58,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 02:58:58,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:59:03,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:59:03,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 02:59:05,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:59:08,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:08,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:59:11,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:59:11,848 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=227280.0, ans=0.125 2023-09-29 02:59:14,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:15,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:59:17,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:19,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-09-29 02:59:20,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:21,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 02:59:21,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=227346.66666666666, ans=0.1 2023-09-29 02:59:23,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:23,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:59:24,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:24,261 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=227346.66666666666, ans=0.2 2023-09-29 02:59:24,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=227346.66666666666, ans=0.07 2023-09-29 02:59:27,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:59:27,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:28,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:28,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 02:59:30,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:59:31,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:59:33,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:59:38,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:59:38,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:59:40,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:59:40,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=227413.33333333334, ans=0.1 2023-09-29 02:59:41,727 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 02:59:43,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:59:43,541 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 02:59:45,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:59:45,143 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-09-29 02:59:46,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:48,145 INFO [train.py:1039] (0/4) Epoch 7, batch 2250, loss[loss=0.2162, simple_loss=0.295, pruned_loss=0.06871, over 24312.00 frames. ], tot_loss[loss=0.2249, simple_loss=0.2896, pruned_loss=0.08008, over 4738807.91 frames. ], batch size: 74, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:59:48,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:59:48,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:50,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=227480.0, ans=0.125 2023-09-29 02:59:51,322 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 02:59:51,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:59:51,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=227480.0, ans=0.1 2023-09-29 02:59:54,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:02,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:00:04,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 03:00:08,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:10,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:11,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:13,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 03:00:13,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:13,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=227546.66666666666, ans=0.04949747468305833 2023-09-29 03:00:14,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:00:16,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 03:00:17,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:00:17,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:19,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 03:00:19,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=227613.33333333334, ans=0.125 2023-09-29 03:00:21,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=227613.33333333334, ans=0.2 2023-09-29 03:00:25,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:27,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:00:27,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:00:28,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 03:00:30,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:31,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:00:34,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=227613.33333333334, ans=0.1 2023-09-29 03:00:35,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:37,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:00:38,283 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.68 vs. limit=15.0 2023-09-29 03:00:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:00:39,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:41,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:41,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:00:47,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:00:50,692 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.075e+02 2.318e+02 2.667e+02 5.695e+02, threshold=4.636e+02, percent-clipped=1.0 2023-09-29 03:00:50,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:00:52,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=227746.66666666666, ans=0.0 2023-09-29 03:00:55,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:00:56,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 03:00:56,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:01:01,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:01:03,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:01:03,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 03:01:04,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:04,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:01:08,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 03:01:08,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=227813.33333333334, ans=0.0 2023-09-29 03:01:09,681 INFO [train.py:1039] (0/4) Epoch 7, batch 2300, loss[loss=0.3124, simple_loss=0.343, pruned_loss=0.1409, over 18958.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2914, pruned_loss=0.081, over 4732539.85 frames. ], batch size: 388, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:01:13,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:01:13,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 03:01:19,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:20,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:01:23,498 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 03:01:25,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:31,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:01:31,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:01:31,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:01:32,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:32,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 03:01:34,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:01:36,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:01:37,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 03:01:37,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=227880.0, ans=0.0 2023-09-29 03:01:40,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:01:44,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:01:49,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:01:51,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=227946.66666666666, ans=0.125 2023-09-29 03:01:56,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:01:56,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:58,655 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=228013.33333333334, ans=0.0 2023-09-29 03:01:59,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:02:01,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:04,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 03:02:06,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:02:06,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:02:06,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 03:02:09,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:02:09,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:11,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:11,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:02:11,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:12,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:02:12,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:02:12,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 03:02:12,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 03:02:12,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:14,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 03:02:21,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:02:25,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:02:30,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:31,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:02:31,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:02:32,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:02:32,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:02:33,870 INFO [train.py:1039] (0/4) Epoch 7, batch 2350, loss[loss=0.2412, simple_loss=0.3058, pruned_loss=0.08824, over 24081.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.2919, pruned_loss=0.08085, over 4738019.49 frames. ], batch size: 86, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:02:34,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 03:02:35,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 03:02:41,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:02:41,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 03:02:43,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=228146.66666666666, ans=0.0 2023-09-29 03:02:48,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 03:02:51,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:54,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:02:54,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:54,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=228213.33333333334, ans=0.0 2023-09-29 03:02:57,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. 
Duration: 0.95 2023-09-29 03:02:58,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=228213.33333333334, ans=0.2 2023-09-29 03:03:01,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:03:04,840 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=228280.0, ans=0.0 2023-09-29 03:03:08,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 03:03:08,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=228280.0, ans=0.0 2023-09-29 03:03:09,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:03:12,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:03:12,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:03:12,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=228280.0, ans=0.1 2023-09-29 03:03:14,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:03:15,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 03:03:17,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 03:03:18,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:03:18,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:18,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:03:20,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=228346.66666666666, ans=0.1 2023-09-29 03:03:21,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:03:24,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=228346.66666666666, ans=0.125 2023-09-29 03:03:25,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 03:03:26,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:03:29,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:03:29,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:03:31,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. 
Duration: 0.54 2023-09-29 03:03:31,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:03:34,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 03:03:34,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:03:36,035 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.092e+02 2.385e+02 2.952e+02 4.935e+02, threshold=4.770e+02, percent-clipped=1.0 2023-09-29 03:03:39,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 03:03:44,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 03:03:45,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.60 vs. limit=15.0 2023-09-29 03:03:46,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:46,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:03:46,115 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 03:03:46,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 03:03:49,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-09-29 03:03:50,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:03:55,188 INFO [train.py:1039] (0/4) Epoch 7, batch 2400, loss[loss=0.2531, simple_loss=0.2973, pruned_loss=0.1044, over 23780.00 frames. ], tot_loss[loss=0.226, simple_loss=0.2911, pruned_loss=0.08041, over 4734334.67 frames. ], batch size: 179, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 03:03:57,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:03:59,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:04:02,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:04:03,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 03:04:05,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 03:04:12,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:04:12,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:16,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 03:04:16,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 03:04:17,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:17,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 03:04:24,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:24,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=228546.66666666666, ans=0.125 2023-09-29 03:04:27,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 03:04:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:04:34,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 03:04:38,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:04:40,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:44,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:04:46,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 03:04:46,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 03:04:52,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:04:54,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:04:56,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:57,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:04:57,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:04:59,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:05:00,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:00,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:00,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:05:00,817 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=15.80 vs. limit=15.0 2023-09-29 03:05:03,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:05:04,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:05:04,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 03:05:06,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 03:05:09,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:05:09,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:11,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 03:05:11,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 03:05:11,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 03:05:12,017 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 03:05:12,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 03:05:13,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:05:16,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:05:16,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:16,857 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 03:05:18,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:18,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:05:19,822 INFO [train.py:1039] (0/4) Epoch 7, batch 2450, loss[loss=0.2106, simple_loss=0.2924, pruned_loss=0.06438, over 24482.00 frames. ], tot_loss[loss=0.2253, simple_loss=0.2905, pruned_loss=0.08001, over 4726204.06 frames. ], batch size: 66, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:05:20,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=228813.33333333334, ans=0.1 2023-09-29 03:05:23,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:05:24,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:24,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=228813.33333333334, ans=0.1 2023-09-29 03:05:29,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:29,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 03:05:29,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 03:05:36,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:05:36,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:40,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:05:40,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:05:40,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:05:41,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 03:05:41,937 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.09 vs. limit=15.0 2023-09-29 03:05:46,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:47,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:05:49,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:51,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 03:05:51,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=228946.66666666666, ans=0.125 2023-09-29 03:05:52,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:52,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:53,531 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.44 vs. limit=10.0 2023-09-29 03:05:54,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:55,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 03:05:57,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:06:06,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:07,798 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:06:08,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:06:08,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:09,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 03:06:10,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:11,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:06:12,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 03:06:13,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=229013.33333333334, ans=0.125 2023-09-29 03:06:15,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:06:15,701 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.60 vs. limit=15.0 2023-09-29 03:06:17,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:06:20,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:06:20,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:23,659 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 2.159e+02 2.588e+02 3.094e+02 4.619e+02, threshold=5.175e+02, percent-clipped=0.0 2023-09-29 03:06:23,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 03:06:23,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 03:06:26,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:06:26,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:06:26,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 03:06:28,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:06:29,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:06:32,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:06:33,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=229080.0, ans=0.0 2023-09-29 03:06:36,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:36,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:06:39,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 03:06:41,317 INFO [train.py:1039] (0/4) Epoch 7, batch 2500, loss[loss=0.2137, simple_loss=0.2785, pruned_loss=0.07443, over 24618.00 frames. 
], tot_loss[loss=0.2248, simple_loss=0.2899, pruned_loss=0.07981, over 4739035.47 frames. ], batch size: 60, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:06:41,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:06:49,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:06:59,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:07:01,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:07:01,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:07:01,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 03:07:09,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:07:09,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:10,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:07:10,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:07:12,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. 
Duration: 0.84 2023-09-29 03:07:13,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:15,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 03:07:15,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 03:07:15,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:16,127 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.99 vs. limit=15.0 2023-09-29 03:07:22,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:07:24,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:26,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=229280.0, ans=0.04949747468305833 2023-09-29 03:07:27,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:07:29,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. 
Duration: 0.86 2023-09-29 03:07:29,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:07:30,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=229346.66666666666, ans=0.2 2023-09-29 03:07:31,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:34,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:37,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:40,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:07:44,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:07:47,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 03:07:47,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:47,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:07:50,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:07:50,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 03:07:51,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=229413.33333333334, ans=0.1 2023-09-29 03:07:52,926 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 03:07:52,927 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 03:07:52,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 03:07:56,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:58,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 03:07:58,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 03:07:59,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:08:00,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 03:08:04,191 INFO [train.py:1039] (0/4) Epoch 7, batch 2550, loss[loss=0.2269, simple_loss=0.3023, pruned_loss=0.07573, over 24369.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2896, pruned_loss=0.08, over 4717716.58 frames. ], batch size: 77, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:08:04,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. 
Duration: 0.98 2023-09-29 03:08:04,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=229480.0, ans=0.125 2023-09-29 03:08:07,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:10,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:08:10,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:08:11,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:13,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 03:08:13,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:08:18,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 03:08:19,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:08:19,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=229546.66666666666, ans=0.0 2023-09-29 03:08:21,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:24,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:08:24,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 03:08:24,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:08:26,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:26,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:29,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:08:31,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 03:08:31,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:08:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:31,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 03:08:44,450 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.64 vs. limit=22.5 2023-09-29 03:08:45,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:08:49,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:08:51,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:51,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:51,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:08:58,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:59,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:09:01,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:09:01,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:09:02,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:09:02,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:09:06,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:06,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 03:09:07,767 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.133e+02 2.441e+02 2.869e+02 4.948e+02, threshold=4.883e+02, percent-clipped=0.0 2023-09-29 03:09:11,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:09:11,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 03:09:11,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:09:13,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:13,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:09:15,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:09:17,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:18,034 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.64 vs. limit=10.0 2023-09-29 03:09:23,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:09:26,268 INFO [train.py:1039] (0/4) Epoch 7, batch 2600, loss[loss=0.2441, simple_loss=0.3153, pruned_loss=0.0865, over 24459.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2903, pruned_loss=0.08024, over 4725233.21 frames. 
], batch size: 69, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:09:26,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:28,493 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 03:09:31,562 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 03:09:31,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:09:31,640 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 03:09:33,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 03:09:33,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=229813.33333333334, ans=0.125 2023-09-29 03:09:34,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=229813.33333333334, ans=10.0 2023-09-29 03:09:34,569 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 03:09:37,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:37,714 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 03:09:39,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 03:09:42,774 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. 
Duration: 0.896 2023-09-29 03:09:42,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:09:45,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 03:09:45,529 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:09:46,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 03:09:49,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:09:49,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 03:09:53,330 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 03:09:53,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 03:09:55,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=229880.0, ans=0.035 2023-09-29 03:09:59,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:09:59,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:01,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 03:10:01,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 03:10:03,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:10:08,056 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 03:10:11,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=229946.66666666666, ans=0.09899494936611666 2023-09-29 03:10:14,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:16,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:17,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 03:10:19,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:19,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:19,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 03:10:22,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:10:22,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:10:24,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:29,534 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 03:10:29,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:30,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:10:32,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=230080.0, ans=0.1 2023-09-29 03:10:35,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:37,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:10:37,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 03:10:37,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:39,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=230080.0, ans=0.125 2023-09-29 03:10:40,192 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.09 vs. limit=15.0 2023-09-29 03:10:40,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:10:40,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=230080.0, ans=0.0 2023-09-29 03:10:42,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:10:46,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 03:10:48,337 INFO [train.py:1039] (0/4) Epoch 7, batch 2650, loss[loss=0.2268, simple_loss=0.309, pruned_loss=0.07234, over 24317.00 frames. ], tot_loss[loss=0.2269, simple_loss=0.2919, pruned_loss=0.0809, over 4734191.10 frames. ], batch size: 74, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:10:48,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:50,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:10:54,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 03:10:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:55,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:10:57,363 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 03:10:57,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:10:59,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:03,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:11:05,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:11:08,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:11:09,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 03:11:09,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:11:09,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:11:10,127 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=230213.33333333334, ans=0.0 2023-09-29 03:11:12,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=230213.33333333334, ans=0.1 2023-09-29 03:11:13,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 03:11:15,147 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 03:11:18,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:11:18,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 03:11:19,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:21,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 03:11:25,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=230280.0, ans=0.07 2023-09-29 03:11:26,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:11:26,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:31,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 03:11:31,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 03:11:35,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 03:11:35,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=230280.0, ans=0.0 2023-09-29 03:11:38,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 03:11:39,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:41,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:41,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:11:41,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:41,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:44,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:45,580 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.99 vs. limit=8.0 2023-09-29 03:11:46,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:48,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:48,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 03:11:49,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:11:51,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:51,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:11:51,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=230346.66666666666, ans=0.0 2023-09-29 03:11:52,731 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.070e+02 2.281e+02 2.771e+02 4.083e+02, threshold=4.562e+02, percent-clipped=0.0 2023-09-29 03:11:52,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:54,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:54,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:11:58,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:00,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:12:00,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:00,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-09-29 03:12:00,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=230413.33333333334, ans=0.125 2023-09-29 03:12:05,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:07,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:07,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:08,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:12:10,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:11,473 INFO [train.py:1039] (0/4) Epoch 7, batch 2700, loss[loss=0.2205, simple_loss=0.2949, pruned_loss=0.07308, over 24490.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.2922, pruned_loss=0.08122, over 4739490.60 frames. ], batch size: 63, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:12:13,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:13,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. 
Duration: 0.84 2023-09-29 03:12:14,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=230480.0, ans=0.0 2023-09-29 03:12:16,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:12:18,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:12:21,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:12:21,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:21,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:22,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:12:22,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:22,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:12:22,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:12:24,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 03:12:24,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 03:12:27,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:12:28,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:12:28,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:30,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=230546.66666666666, ans=0.125 2023-09-29 03:12:32,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:12:32,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 03:12:33,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:12:34,247 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.99 vs. limit=12.0 2023-09-29 03:12:37,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=230546.66666666666, ans=0.0 2023-09-29 03:12:42,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:12:42,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:12:47,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:12:47,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:47,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:12:48,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:12:51,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:12:54,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:54,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:12:54,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:12:58,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:58,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:13:08,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 03:13:08,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:13,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:13:13,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:16,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:17,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:19,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:13:20,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:22,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:22,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:25,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:13:26,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:13:26,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:30,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 03:13:30,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:32,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:13:32,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 03:13:34,015 INFO [train.py:1039] (0/4) Epoch 7, batch 2750, loss[loss=0.2361, simple_loss=0.306, pruned_loss=0.08306, over 24355.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.2918, pruned_loss=0.08127, over 4724932.72 frames. ], batch size: 77, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:13:34,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 03:13:34,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:38,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:13:38,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 03:13:39,148 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:13:41,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:41,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:13:42,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:45,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:13:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:13:47,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:13:47,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:47,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 03:13:47,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:13:47,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:54,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. 
Duration: 0.46 2023-09-29 03:13:57,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:57,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:57,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:57,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:13:57,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=230880.0, ans=0.1 2023-09-29 03:13:58,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:00,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:14:00,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=230880.0, ans=0.0 2023-09-29 03:14:01,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:01,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:08,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:14:08,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 03:14:08,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:14:09,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:10,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:14:16,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:18,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:14:18,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:23,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:23,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:14:25,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:14:31,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:14:31,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:14:31,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. 
Duration: 0.75 2023-09-29 03:14:36,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:36,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 03:14:38,314 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.012e+02 2.382e+02 2.660e+02 4.649e+02, threshold=4.763e+02, percent-clipped=1.0 2023-09-29 03:14:41,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:14:44,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:14:44,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 03:14:45,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=231080.0, ans=0.125 2023-09-29 03:14:46,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:14:50,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:14:50,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 03:14:50,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:14:54,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. 
Duration: 0.47775 2023-09-29 03:14:55,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:14:55,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:14:55,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 03:14:55,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:57,120 INFO [train.py:1039] (0/4) Epoch 7, batch 2800, loss[loss=0.2301, simple_loss=0.2832, pruned_loss=0.08848, over 23781.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2904, pruned_loss=0.08136, over 4707799.93 frames. ], batch size: 179, lr: 1.47e-02, grad_scale: 32.0 2023-09-29 03:14:57,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:00,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:00,178 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 03:15:00,179 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 03:15:03,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:04,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 03:15:06,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:15:09,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:15:12,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 03:15:15,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:15:17,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 03:15:17,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:19,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:15:19,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:20,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=231213.33333333334, ans=0.035 2023-09-29 03:15:23,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:23,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:23,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 03:15:24,591 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.74 vs. limit=15.0 2023-09-29 03:15:25,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:15:34,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:15:37,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:40,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:40,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:15:42,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:48,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:15:48,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 03:15:48,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:50,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 03:15:50,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:15:53,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:55,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:59,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:16:01,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:16:04,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:16:04,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:16:04,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:16:05,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:16:05,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 03:16:05,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:16:08,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:16:08,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 03:16:10,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:10,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:16:10,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:16:11,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 03:16:12,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=231413.33333333334, ans=0.0 2023-09-29 03:16:17,963 INFO [train.py:1039] (0/4) Epoch 7, batch 2850, loss[loss=0.221, simple_loss=0.2778, pruned_loss=0.0821, over 23675.00 frames. ], tot_loss[loss=0.2251, simple_loss=0.2893, pruned_loss=0.08048, over 4704904.07 frames. ], batch size: 135, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:16:18,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:16:18,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 03:16:19,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:16:21,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:25,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:16:25,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:16:25,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:16:29,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:29,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:31,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:16:31,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 03:16:37,920 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=12.0 2023-09-29 03:16:38,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 03:16:38,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:16:40,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 03:16:41,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:46,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 03:16:46,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 03:16:47,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:47,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=231546.66666666666, ans=0.125 2023-09-29 03:17:00,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:00,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:00,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:17:02,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:17:02,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:17:02,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 03:17:04,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:17:04,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 03:17:06,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:17:07,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:07,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:10,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:12,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:16,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:19,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:17:19,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:17:21,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:21,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.36 vs. limit=15.0 2023-09-29 03:17:24,053 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.041e+02 2.225e+02 2.602e+02 4.724e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 03:17:24,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:17:28,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:17:30,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 03:17:30,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 03:17:31,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:17:33,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:33,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 03:17:35,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 03:17:35,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:35,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:35,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:17:35,484 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 03:17:37,475 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 03:17:37,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:17:37,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:40,488 INFO [train.py:1039] (0/4) Epoch 7, batch 2900, loss[loss=0.2452, simple_loss=0.3182, pruned_loss=0.08605, over 24007.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2896, pruned_loss=0.08078, over 4710078.16 frames. ], batch size: 86, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:17:40,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=231813.33333333334, ans=0.125 2023-09-29 03:17:42,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:17:42,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:17:42,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:44,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 03:17:49,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:49,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 03:17:49,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 03:17:51,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:17:51,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:17:54,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:55,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:18:00,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:18:00,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:18:00,775 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=231880.0, ans=0.125 2023-09-29 03:18:02,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:18:03,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 03:18:03,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:18:05,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:07,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 03:18:09,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 03:18:11,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:18:11,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 03:18:12,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:18:14,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:18:14,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 03:18:16,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=231946.66666666666, ans=0.125 2023-09-29 03:18:17,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:18:18,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:24,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:18:26,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:27,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 03:18:27,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 03:18:27,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:18:32,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:18:34,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 03:18:35,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:18:42,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:18:51,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:18:51,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:18:52,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 03:18:56,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:56,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 03:18:56,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:18:58,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:19:01,809 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.62 vs. limit=10.0 2023-09-29 03:19:02,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:19:04,276 INFO [train.py:1039] (0/4) Epoch 7, batch 2950, loss[loss=0.3366, simple_loss=0.3651, pruned_loss=0.1541, over 19161.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.291, pruned_loss=0.08131, over 4707270.19 frames. ], batch size: 388, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:19:04,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. 
Duration: 0.71 2023-09-29 03:19:05,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:05,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:07,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:08,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:19:10,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 03:19:11,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 03:19:12,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:19:12,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:19,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:22,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:23,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:19:25,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 03:19:27,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=232213.33333333334, ans=0.125 2023-09-29 03:19:29,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:19:29,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:19:30,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:19:32,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=232213.33333333334, ans=0.5 2023-09-29 03:19:35,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 03:19:39,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 03:19:40,679 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 03:19:40,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 03:19:41,025 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=232280.0, ans=0.125 2023-09-29 03:19:41,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.27 vs. limit=6.0 2023-09-29 03:19:42,332 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 03:19:43,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 03:19:43,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:45,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:45,753 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 03:19:45,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:19:49,677 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.43 vs. limit=22.5 2023-09-29 03:19:50,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 03:19:50,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:51,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 03:19:54,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:57,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:19:58,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:19:58,554 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 03:19:58,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:58,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 03:20:00,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=232346.66666666666, ans=0.2 2023-09-29 03:20:02,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=232346.66666666666, ans=0.125 2023-09-29 03:20:06,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:08,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:10,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 03:20:10,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:20:11,822 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.080e+02 2.429e+02 2.872e+02 4.397e+02, threshold=4.858e+02, percent-clipped=0.0 2023-09-29 03:20:12,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 03:20:15,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:15,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=232413.33333333334, ans=0.125 2023-09-29 03:20:16,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:20:16,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:20:18,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:19,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:20:21,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:20:23,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:23,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. 
Duration: 0.82225 2023-09-29 03:20:23,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:20:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:24,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:20:26,246 INFO [train.py:1039] (0/4) Epoch 7, batch 3000, loss[loss=0.2054, simple_loss=0.2765, pruned_loss=0.06712, over 24309.00 frames. ], tot_loss[loss=0.2263, simple_loss=0.2917, pruned_loss=0.08042, over 4724102.95 frames. ], batch size: 56, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:20:26,247 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 03:20:40,692 INFO [train.py:1071] (0/4) Epoch 7, validation: loss=0.3621, simple_loss=0.3045, pruned_loss=0.2099, over 1125622.00 frames. 2023-09-29 03:20:40,694 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 03:20:40,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:40,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 03:20:42,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:45,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:20:47,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 03:20:50,563 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 03:20:50,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 03:20:52,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:54,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:20:54,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 03:20:54,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:20:57,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=232546.66666666666, ans=0.0 2023-09-29 03:21:01,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:21:09,070 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.24 vs. limit=15.0 2023-09-29 03:21:10,838 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.81 vs. limit=10.0 2023-09-29 03:21:11,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:21:18,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. 
Duration: 0.72 2023-09-29 03:21:18,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:21:21,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:21:23,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:23,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:21:24,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:24,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 03:21:28,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 03:21:30,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:21:30,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:21:31,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:21:31,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:33,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 03:21:33,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:21:35,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=232680.0, ans=0.1 2023-09-29 03:21:36,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:21:38,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:38,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:21:39,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:40,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=232680.0, ans=0.125 2023-09-29 03:21:42,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 03:21:43,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:21:43,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:21:43,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:21:46,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 03:21:46,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:48,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:21:48,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 03:21:48,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:21:48,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=232746.66666666666, ans=0.0 2023-09-29 03:21:50,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 03:21:50,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:21:51,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 03:21:51,962 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=232746.66666666666, ans=0.125 2023-09-29 03:21:55,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:21:57,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:21:57,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. 
Duration: 0.88 2023-09-29 03:21:57,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 03:21:57,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:21:58,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:22:00,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:22:01,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:22:01,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:01,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:22:03,390 INFO [train.py:1039] (0/4) Epoch 7, batch 3050, loss[loss=0.2512, simple_loss=0.3159, pruned_loss=0.09328, over 23140.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.2921, pruned_loss=0.08118, over 4726339.96 frames. ], batch size: 105, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:22:05,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 03:22:08,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:11,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 03:22:11,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:22:14,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:17,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 03:22:24,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 03:22:25,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 03:22:25,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:31,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:22:35,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:36,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:39,131 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.94 vs. limit=22.5 2023-09-29 03:22:39,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 03:22:39,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:22:39,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:41,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:41,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:42,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:44,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:44,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=232946.66666666666, ans=0.0 2023-09-29 03:22:45,039 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-09-29 03:22:46,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:46,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 03:22:48,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:22:48,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:22:52,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:53,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:22:53,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:22:54,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:22:58,253 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=233013.33333333334, ans=0.0 2023-09-29 03:23:00,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:23:00,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:01,667 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.82 vs. limit=10.0 2023-09-29 03:23:07,899 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.00 vs. limit=15.0 2023-09-29 03:23:09,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 03:23:09,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:09,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:23:11,240 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.044e+02 2.286e+02 2.664e+02 4.744e+02, threshold=4.572e+02, percent-clipped=0.0 2023-09-29 03:23:11,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:11,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:23:12,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:23:14,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 03:23:14,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:14,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=233080.0, ans=0.125 2023-09-29 03:23:15,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:17,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 03:23:20,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 03:23:22,839 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:23:25,320 INFO [train.py:1039] (0/4) Epoch 7, batch 3100, loss[loss=0.2248, simple_loss=0.2959, pruned_loss=0.07683, over 23715.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2924, pruned_loss=0.0805, over 4735729.59 frames. ], batch size: 85, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:23:25,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=233146.66666666666, ans=0.1 2023-09-29 03:23:26,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:27,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:23:30,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:23:31,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 03:23:35,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 03:23:36,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 03:23:38,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:23:43,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 03:23:43,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:45,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:23:48,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:53,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 03:23:58,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:23:59,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:59,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:23:59,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:01,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:24:04,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:24:04,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 03:24:04,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 03:24:05,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:07,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 03:24:10,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:24:13,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:24:14,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 03:24:16,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 03:24:18,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:18,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:21,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:21,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:21,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:24:23,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 03:24:23,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:24:24,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:24:26,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:24:26,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:26,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:24:29,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:24:29,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 03:24:32,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:24:33,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 03:24:33,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:33,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:35,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-09-29 03:24:47,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 03:24:48,380 INFO [train.py:1039] (0/4) Epoch 7, batch 3150, loss[loss=0.2146, simple_loss=0.2847, pruned_loss=0.07223, over 24473.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.2909, pruned_loss=0.08028, over 4733762.82 frames. ], batch size: 63, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:24:50,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:50,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:51,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:51,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:24:53,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 03:24:55,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:55,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:24:57,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 03:24:58,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:25:00,783 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 03:25:01,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=233480.0, ans=0.0 2023-09-29 03:25:03,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 03:25:04,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:04,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=233546.66666666666, ans=0.0 2023-09-29 03:25:05,446 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 03:25:06,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:25:08,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 03:25:08,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 03:25:08,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 03:25:08,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:08,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 03:25:08,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:10,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 03:25:13,321 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.10 vs. limit=10.0 2023-09-29 03:25:13,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:14,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:15,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:17,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:25:22,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 03:25:23,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:25:26,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:25:27,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:28,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-09-29 03:25:31,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 03:25:32,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:25:32,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:25:32,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:25:33,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:33,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:25:35,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:25:35,953 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.49 vs. limit=10.0 2023-09-29 03:25:36,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:25:38,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 03:25:39,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:25:39,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:25:41,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:25:41,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:42,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 03:25:42,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:44,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 03:25:44,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:44,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 03:25:46,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 03:25:48,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:25:49,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:51,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 03:25:53,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 03:25:53,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:56,384 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 2.159e+02 2.421e+02 2.808e+02 3.931e+02, threshold=4.841e+02, percent-clipped=0.0 2023-09-29 03:25:56,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:58,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:58,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:25:58,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=233746.66666666666, ans=0.125 2023-09-29 03:26:04,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:26:04,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:07,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 03:26:10,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=233813.33333333334, ans=0.0 2023-09-29 03:26:11,313 INFO [train.py:1039] (0/4) Epoch 7, batch 3200, loss[loss=0.1917, simple_loss=0.2581, pruned_loss=0.06262, over 24296.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2899, pruned_loss=0.07979, over 4743773.48 frames. 
], batch size: 56, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:26:14,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:26:14,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:26:18,352 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=15.75 vs. limit=15.0 2023-09-29 03:26:18,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:20,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:26:20,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 03:26:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:26:27,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:26:30,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:35,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=233880.0, ans=0.125 2023-09-29 03:26:40,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 03:26:40,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=233880.0, ans=0.2 2023-09-29 03:26:51,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 03:26:51,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:26:51,433 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=233946.66666666666, ans=0.125 2023-09-29 03:26:54,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 03:26:56,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:27:01,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:27:01,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:27:01,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:27:04,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 03:27:06,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:27:07,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. 
Duration: 0.82 2023-09-29 03:27:10,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 03:27:12,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:27:15,016 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=234080.0, ans=0.125 2023-09-29 03:27:19,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:19,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:27:19,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:20,885 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 03:27:20,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:27:25,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:27,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 03:27:28,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 03:27:30,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. 
Duration: 0.55 2023-09-29 03:27:32,182 INFO [train.py:1039] (0/4) Epoch 7, batch 3250, loss[loss=0.2219, simple_loss=0.302, pruned_loss=0.07093, over 23738.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2895, pruned_loss=0.07936, over 4742827.24 frames. ], batch size: 85, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:27:32,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 03:27:32,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=234146.66666666666, ans=0.0 2023-09-29 03:27:33,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:27:37,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:27:37,037 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 03:27:38,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:27:38,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:27:40,042 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 03:27:43,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:27:46,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 03:27:52,475 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.41 vs. limit=22.5 2023-09-29 03:27:53,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:27:53,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 03:27:53,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:54,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:54,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:27:56,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:27:56,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:28:00,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:00,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 03:28:00,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=234213.33333333334, ans=0.2 2023-09-29 03:28:02,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:02,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:04,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:07,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:28:09,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:10,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:12,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:12,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:28:12,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 03:28:17,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 03:28:17,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:28:17,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:28:18,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:18,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:28:19,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=234346.66666666666, ans=0.125 2023-09-29 03:28:25,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:28:35,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:35,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:35,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 03:28:35,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:28:35,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-29 03:28:37,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:38,673 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.002e+02 2.212e+02 2.720e+02 4.684e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 03:28:38,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 03:28:38,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 03:28:38,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:41,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:42,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:44,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:28:44,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:48,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:48,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:28:50,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-09-29 03:28:50,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:28:52,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:28:52,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 03:28:53,353 INFO [train.py:1039] (0/4) Epoch 7, batch 3300, loss[loss=0.2915, simple_loss=0.3313, pruned_loss=0.1258, over 19706.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2897, pruned_loss=0.07973, over 4726885.09 frames. ], batch size: 388, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:28:55,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:55,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 03:28:58,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 03:29:00,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 03:29:00,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:03,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:29:05,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 03:29:06,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:06,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:29:08,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:29:10,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:12,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:29:16,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 03:29:18,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:19,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:20,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:20,519 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. 
Duration: 0.896 2023-09-29 03:29:20,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=234546.66666666666, ans=0.125 2023-09-29 03:29:22,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=234546.66666666666, ans=0.125 2023-09-29 03:29:23,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:29:23,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:29:23,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:29:23,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:29:23,616 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 03:29:23,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=234546.66666666666, ans=0.125 2023-09-29 03:29:28,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:28,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:29:30,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:30,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. 
Duration: 0.97725 2023-09-29 03:29:31,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 03:29:31,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:34,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:29:35,030 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 03:29:39,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 03:29:39,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:29:42,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 03:29:45,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:29:48,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:29:48,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:29:53,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:53,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:29:53,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:54,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:29:56,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:29:58,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:58,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:29:59,987 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 03:30:00,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 03:30:03,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:30:03,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=234746.66666666666, ans=0.125 2023-09-29 03:30:04,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:04,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:06,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:30:06,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:07,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:30:07,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:07,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:30:09,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:30:11,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:30:15,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 03:30:15,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:16,653 INFO [train.py:1039] (0/4) Epoch 7, batch 3350, loss[loss=0.1978, simple_loss=0.2574, pruned_loss=0.06908, over 24303.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2903, pruned_loss=0.0804, over 4735954.63 frames. ], batch size: 56, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:30:16,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:18,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 03:30:18,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:30:20,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:22,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=234813.33333333334, ans=0.125 2023-09-29 03:30:23,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:23,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:26,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:30:28,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:29,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:30:31,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:34,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:30:34,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:30:36,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:30:36,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 03:30:38,056 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 03:30:39,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:43,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 03:30:43,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 03:30:46,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:30:46,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:30:46,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:47,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 03:30:47,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:47,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 03:30:49,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:51,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:52,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:52,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:30:56,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:30:59,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:59,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:02,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:31:04,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:07,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:07,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:31:10,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:12,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 03:31:12,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:31:12,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 03:31:14,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:31:15,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 03:31:17,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:17,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:23,272 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.937e+02 2.206e+02 2.498e+02 3.654e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 03:31:27,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:27,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 03:31:28,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 03:31:28,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:31:30,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:31:35,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:38,482 INFO [train.py:1039] (0/4) Epoch 7, batch 3400, loss[loss=0.3228, simple_loss=0.3527, pruned_loss=0.1464, over 19396.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.291, pruned_loss=0.08098, over 4719560.64 frames. ], batch size: 388, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:31:38,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 03:31:38,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:31:38,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:31:40,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:42,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 03:31:43,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:43,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-09-29 03:31:45,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:31:48,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:31:48,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 03:31:49,767 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-09-29 03:31:52,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 03:31:52,040 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 03:31:52,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:56,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:56,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:57,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:31:58,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:32:03,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:05,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 03:32:12,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:32:14,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:15,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:17,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:32:22,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:32:25,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 03:32:26,000 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.66 vs. limit=15.0 2023-09-29 03:32:30,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 03:32:31,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 03:32:31,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:32:33,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:33,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:35,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:32:38,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:43,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:32:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:32:43,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=235413.33333333334, ans=0.0 2023-09-29 03:32:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:32:52,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 03:32:57,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 03:32:57,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=235413.33333333334, ans=0.125 2023-09-29 03:33:00,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 03:33:01,930 INFO [train.py:1039] (0/4) Epoch 7, batch 3450, loss[loss=0.23, simple_loss=0.2841, pruned_loss=0.08798, over 23707.00 frames. ], tot_loss[loss=0.2274, simple_loss=0.2913, pruned_loss=0.08172, over 4713621.43 frames. ], batch size: 232, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:33:06,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 03:33:08,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:09,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:33:09,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 03:33:10,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:33:13,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:33:17,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=235546.66666666666, ans=10.0 2023-09-29 03:33:18,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 03:33:18,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:20,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:33:20,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:23,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:30,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 03:33:30,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=235546.66666666666, ans=0.125 2023-09-29 03:33:33,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=235613.33333333334, ans=0.0 2023-09-29 03:33:35,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 03:33:35,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:33:35,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:33:38,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:42,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. 
Duration: 0.99 2023-09-29 03:33:43,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:33:48,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:33:50,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:50,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:33:51,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:33:54,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 03:33:54,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:33:57,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:34:00,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:03,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 03:34:06,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 03:34:09,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.982e+02 2.278e+02 2.656e+02 4.314e+02, threshold=4.555e+02, percent-clipped=0.0 2023-09-29 03:34:12,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:34:13,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:15,674 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.98 vs. limit=15.0 2023-09-29 03:34:16,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:21,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:21,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:34:21,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:34:23,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:34:24,938 INFO [train.py:1039] (0/4) Epoch 7, batch 3500, loss[loss=0.1999, simple_loss=0.2749, pruned_loss=0.06248, over 24378.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2903, pruned_loss=0.08184, over 4700466.53 frames. 
], batch size: 61, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:34:26,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:30,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:34:30,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 03:34:32,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:34:35,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:34:37,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:37,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 03:34:37,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=235813.33333333334, ans=0.0 2023-09-29 03:34:42,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:34:44,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:44,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 03:34:44,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:34:45,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:34:47,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:47,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:34:47,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 03:34:49,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:50,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:34:51,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=235880.0, ans=0.0 2023-09-29 03:34:52,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:34:56,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:57,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 03:34:57,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 03:35:01,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:35:04,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:35:05,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:07,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:35:09,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:12,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 03:35:12,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 03:35:14,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 03:35:14,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:15,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:16,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 03:35:16,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=236013.33333333334, ans=0.125 2023-09-29 03:35:17,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:35:18,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=236013.33333333334, ans=0.0 2023-09-29 03:35:21,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:35:21,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:35:23,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=236013.33333333334, ans=0.125 2023-09-29 03:35:27,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:35:28,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 03:35:28,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 03:35:28,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:35:32,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:35:32,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:35,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:38,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 03:35:38,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:42,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:42,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 03:35:44,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 03:35:47,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:47,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:47,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:35:47,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:35:49,014 INFO [train.py:1039] (0/4) Epoch 7, batch 3550, loss[loss=0.2302, simple_loss=0.3064, pruned_loss=0.07695, over 24634.00 frames. 
], tot_loss[loss=0.2261, simple_loss=0.2892, pruned_loss=0.08146, over 4700422.95 frames. ], batch size: 68, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:35:50,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:35:52,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=236146.66666666666, ans=0.0 2023-09-29 03:36:00,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:03,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:36:06,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:07,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=236213.33333333334, ans=0.0 2023-09-29 03:36:08,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:36:08,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:10,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:36:10,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:36:14,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 03:36:15,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:36:15,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:15,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:36:15,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:36:22,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:36:22,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:23,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:23,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:23,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:36:23,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 03:36:25,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:36:25,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=236280.0, ans=0.125 2023-09-29 03:36:26,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:28,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:36:31,086 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.68 vs. limit=10.0 2023-09-29 03:36:34,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:34,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:36,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:38,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 03:36:40,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:36:41,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 03:36:41,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 03:36:41,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=236346.66666666666, ans=0.125 2023-09-29 03:36:43,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:36:43,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:36:46,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 03:36:49,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=236346.66666666666, ans=0.5 2023-09-29 03:36:50,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:55,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:56,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 03:36:56,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:36:57,955 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.050e+02 2.243e+02 2.592e+02 3.943e+02, threshold=4.485e+02, percent-clipped=0.0 2023-09-29 03:36:59,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:37:01,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 03:37:02,573 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-09-29 03:37:07,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 03:37:07,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:07,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:37:10,911 INFO [train.py:1039] (0/4) Epoch 7, batch 3600, loss[loss=0.1916, simple_loss=0.2724, pruned_loss=0.05537, over 24656.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2892, pruned_loss=0.08099, over 4698360.42 frames. ], batch size: 65, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:37:10,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:11,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:12,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:37:16,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:18,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 03:37:18,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:37:20,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:37:21,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:21,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 03:37:25,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:37:26,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:29,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:32,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:33,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:37:35,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:35,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 03:37:35,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 03:37:38,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:38,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:37:41,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:37:42,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:42,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:37:45,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 03:37:47,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=236613.33333333334, ans=10.0 2023-09-29 03:37:51,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:54,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:37:54,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 03:37:57,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:38:02,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:38:04,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=236680.0, ans=0.125 2023-09-29 03:38:05,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:11,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:38:11,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:38:12,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 03:38:13,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 03:38:13,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 03:38:13,775 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=236680.0, ans=0.0 2023-09-29 03:38:15,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:38:17,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:38:18,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=236746.66666666666, ans=15.0 2023-09-29 03:38:18,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. 
Duration: 0.84 2023-09-29 03:38:20,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:20,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:38:20,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:22,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 03:38:22,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=236746.66666666666, ans=0.125 2023-09-29 03:38:23,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 03:38:27,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:28,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 03:38:34,832 INFO [train.py:1039] (0/4) Epoch 7, batch 3650, loss[loss=0.2155, simple_loss=0.2841, pruned_loss=0.07349, over 24678.00 frames. ], tot_loss[loss=0.226, simple_loss=0.2897, pruned_loss=0.08114, over 4701242.74 frames. ], batch size: 65, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:38:34,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 03:38:36,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 03:38:39,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 03:38:42,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 03:38:46,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:38:46,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:38:47,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:38:51,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:38:51,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:51,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 03:38:51,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:38:53,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:53,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 03:38:54,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 03:38:55,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:38:57,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:38:57,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=236880.0, ans=0.1 2023-09-29 03:38:58,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:39:00,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 03:39:02,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 03:39:04,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:39:05,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 03:39:05,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:05,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:39:13,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:39:16,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 03:39:16,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:39:17,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:39:19,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:39:21,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:39:24,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:26,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:26,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:28,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:39:28,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:30,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:36,513 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 03:39:41,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 03:39:41,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:43,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:39:43,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:44,550 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.001e+02 2.357e+02 2.750e+02 4.366e+02, threshold=4.713e+02, percent-clipped=0.0 2023-09-29 03:39:44,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:39:46,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:47,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 03:39:47,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:51,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:39:52,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:52,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 03:39:56,980 INFO [train.py:1039] (0/4) Epoch 7, batch 3700, loss[loss=0.2195, simple_loss=0.2927, pruned_loss=0.07313, over 24331.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2902, pruned_loss=0.08081, over 4725446.15 frames. ], batch size: 74, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:39:57,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:57,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 03:39:57,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:57,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:39:58,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:40:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:40:04,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:06,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:07,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:40:07,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:40:09,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:40:10,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:12,745 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 03:40:20,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:40:21,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:40:23,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:40:23,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 03:40:23,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:29,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:29,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=237280.0, ans=0.125 2023-09-29 03:40:29,873 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.64 vs. limit=6.0 2023-09-29 03:40:30,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-09-29 03:40:30,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:32,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:40:35,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:35,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:40:37,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:40:41,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=237280.0, ans=0.0 2023-09-29 03:40:41,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=237280.0, ans=0.125 2023-09-29 03:40:42,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:42,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 03:40:42,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:43,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 03:40:47,936 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.23 vs. 
limit=6.0 2023-09-29 03:40:50,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:40:50,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:40:53,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:55,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 03:40:58,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:40:58,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:40:58,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:40:58,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:01,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:41:02,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 03:41:02,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 03:41:04,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 03:41:04,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:06,504 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=237413.33333333334, ans=0.125 2023-09-29 03:41:07,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:41:07,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:41:11,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:41:15,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:41:17,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:41:18,613 INFO [train.py:1039] (0/4) Epoch 7, batch 3750, loss[loss=0.229, simple_loss=0.2997, pruned_loss=0.07913, over 23296.00 frames. ], tot_loss[loss=0.2261, simple_loss=0.2907, pruned_loss=0.08072, over 4736875.52 frames. ], batch size: 93, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:41:18,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 03:41:20,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:41:23,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 03:41:23,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 03:41:25,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:41:27,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:27,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:28,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=237480.0, ans=0.0 2023-09-29 03:41:30,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:41:33,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:35,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=237546.66666666666, ans=0.125 2023-09-29 03:41:37,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:41:39,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:41:40,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:44,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:41:45,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 03:41:45,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=237546.66666666666, ans=0.2 2023-09-29 03:41:47,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:41:48,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:49,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:53,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=237613.33333333334, ans=0.125 2023-09-29 03:41:54,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 03:41:56,931 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.65 vs. limit=15.0 2023-09-29 03:41:57,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 03:41:59,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:59,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 03:42:01,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:06,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:07,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:42:08,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=237680.0, ans=0.2 2023-09-29 03:42:10,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 03:42:11,573 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.65 vs. limit=12.0 2023-09-29 03:42:13,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:17,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:42:18,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:42:18,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=237680.0, ans=0.1 2023-09-29 03:42:20,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 03:42:27,705 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.086e+02 2.277e+02 2.555e+02 3.671e+02, threshold=4.554e+02, percent-clipped=0.0 2023-09-29 03:42:27,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:42:29,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:42:32,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:42:32,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:42:34,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=237746.66666666666, ans=0.125 2023-09-29 03:42:35,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:42:40,471 INFO [train.py:1039] (0/4) Epoch 7, batch 3800, loss[loss=0.2299, simple_loss=0.2894, pruned_loss=0.08524, over 23500.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2916, pruned_loss=0.08074, over 4735058.24 frames. ], batch size: 134, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:42:45,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:42:49,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:49,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-29 03:42:51,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 03:42:53,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:55,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:42:55,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:42:56,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 03:42:56,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:58,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:43:01,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:43:01,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:43:01,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:03,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 03:43:06,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. 
Duration: 0.4 2023-09-29 03:43:08,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:43:09,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:13,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:43:14,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:43:16,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:43:17,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:19,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:20,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:22,472 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=237946.66666666666, ans=0.0 2023-09-29 03:43:26,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:43:26,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 03:43:27,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:43:32,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:43:41,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:43:44,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 03:43:46,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 03:43:48,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:49,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:51,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:52,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 03:43:56,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 03:43:56,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 03:43:56,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:57,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 03:44:03,083 INFO [train.py:1039] (0/4) Epoch 7, batch 3850, loss[loss=0.2068, simple_loss=0.2417, pruned_loss=0.08596, over 19556.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.2908, pruned_loss=0.08037, over 4726927.52 frames. ], batch size: 388, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:44:03,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:44:04,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:44:11,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:44:11,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 03:44:13,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:44:15,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:18,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:44:20,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:23,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:44:24,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-09-29 03:44:29,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:31,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:34,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:34,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:44:39,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:39,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:44:40,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:40,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:44:41,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 03:44:44,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:44:46,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 03:44:46,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 03:44:48,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:48,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:52,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 03:44:56,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 03:44:58,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:58,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=238346.66666666666, ans=0.0 2023-09-29 03:44:59,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. 
Duration: 0.95 2023-09-29 03:45:00,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=238346.66666666666, ans=0.0 2023-09-29 03:45:01,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:45:06,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:08,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:45:13,054 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.089e+02 2.371e+02 2.859e+02 5.421e+02, threshold=4.742e+02, percent-clipped=3.0 2023-09-29 03:45:13,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:13,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 03:45:15,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 03:45:18,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:19,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:20,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:45:20,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 03:45:21,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:45:23,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 03:45:24,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:45:26,766 INFO [train.py:1039] (0/4) Epoch 7, batch 3900, loss[loss=0.2484, simple_loss=0.3025, pruned_loss=0.09721, over 23696.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2888, pruned_loss=0.08018, over 4703454.52 frames. ], batch size: 135, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:45:28,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 03:45:28,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:28,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:29,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:45:29,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:45:31,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:45:32,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:32,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:33,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:33,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 03:45:34,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:39,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:39,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:41,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:45:41,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:43,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:44,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:45:46,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:45:47,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 03:45:47,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:45:50,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 03:45:50,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:50,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 03:45:52,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 03:45:57,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:45:59,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:59,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:46:01,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:05,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 03:46:07,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:46:12,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:46:12,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:13,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:46:15,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=238680.0, ans=0.125 2023-09-29 03:46:19,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:20,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:46:22,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=238680.0, ans=0.125 2023-09-29 03:46:25,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:46:25,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=238680.0, ans=0.125 2023-09-29 03:46:26,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 03:46:32,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=238746.66666666666, ans=0.125 2023-09-29 03:46:37,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:46:39,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:41,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 03:46:41,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 03:46:41,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:42,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 03:46:44,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:44,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 03:46:48,676 INFO [train.py:1039] (0/4) Epoch 7, batch 3950, loss[loss=0.2265, simple_loss=0.2892, pruned_loss=0.08189, over 23485.00 frames. ], tot_loss[loss=0.2243, simple_loss=0.2882, pruned_loss=0.0802, over 4709264.22 frames. ], batch size: 120, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:46:52,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:46:54,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 03:46:55,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:46:57,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:47:00,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:47:04,639 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 03:47:05,644 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.86 vs. limit=15.0 2023-09-29 03:47:05,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:06,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 03:47:07,467 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 03:47:07,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:11,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:11,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 03:47:11,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:14,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 03:47:17,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:47:17,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:17,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:47:18,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:47:18,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:47:28,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=238946.66666666666, ans=0.1 2023-09-29 03:47:30,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:47:31,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 03:47:31,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=238946.66666666666, ans=0.1 2023-09-29 03:47:36,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 03:47:37,700 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.38 vs. limit=6.0 2023-09-29 03:47:44,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 03:47:45,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 03:47:45,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:47:45,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:47:47,707 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.07 vs. limit=15.0 2023-09-29 03:47:51,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:47:51,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=239013.33333333334, ans=0.2 2023-09-29 03:47:52,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 03:47:52,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:53,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:47:53,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 03:47:53,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=239080.0, ans=0.0 2023-09-29 03:47:55,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=239080.0, ans=0.0 2023-09-29 03:47:55,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=239080.0, ans=0.1 2023-09-29 03:47:57,668 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.025e+02 2.218e+02 2.611e+02 3.934e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 03:47:57,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:47:58,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:47:58,978 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.86 vs. limit=15.0 2023-09-29 03:48:01,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. 
Duration: 0.96 2023-09-29 03:48:04,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=239080.0, ans=0.125 2023-09-29 03:48:11,834 INFO [train.py:1039] (0/4) Epoch 7, batch 4000, loss[loss=0.228, simple_loss=0.3028, pruned_loss=0.07663, over 24649.00 frames. ], tot_loss[loss=0.2261, simple_loss=0.2897, pruned_loss=0.08123, over 4696433.33 frames. ], batch size: 73, lr: 1.45e-02, grad_scale: 32.0 2023-09-29 03:48:12,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:12,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=239146.66666666666, ans=0.0 2023-09-29 03:48:17,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=239146.66666666666, ans=0.125 2023-09-29 03:48:18,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:24,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:24,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:48:26,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:26,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 03:48:27,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 03:48:27,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 03:48:27,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:48:27,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 03:48:30,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:33,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:48:34,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:48:34,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:48:36,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:48:36,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:48:39,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:48:41,265 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 03:48:41,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 03:48:43,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:48:46,666 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 03:48:48,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:48:49,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:48:51,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=239280.0, ans=0.2 2023-09-29 03:48:54,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=239280.0, ans=0.1 2023-09-29 03:48:57,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 03:48:57,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:49:00,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:49:02,040 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 03:49:03,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:49:03,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-09-29 03:49:03,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:05,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:06,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:49:08,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:49:09,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:49:09,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:49:11,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 03:49:11,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:13,304 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 03:49:18,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:49:20,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:49:22,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 03:49:22,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:23,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:49:25,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:29,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:30,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:49:30,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 03:49:32,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=239480.0, ans=0.05 2023-09-29 03:49:33,606 INFO [train.py:1039] (0/4) Epoch 7, batch 4050, loss[loss=0.2303, simple_loss=0.2845, pruned_loss=0.08807, over 22851.00 frames. ], tot_loss[loss=0.2276, simple_loss=0.2912, pruned_loss=0.08198, over 4696299.39 frames. ], batch size: 322, lr: 1.44e-02, grad_scale: 32.0 2023-09-29 03:49:33,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:49:33,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:49:33,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 03:49:36,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:49:38,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:41,691 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:49:42,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:46,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:49:46,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:49:50,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:49:51,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:56,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:58,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:50:00,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 03:50:03,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. 
Duration: 0.73 2023-09-29 03:50:03,798 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 03:50:04,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=239546.66666666666, ans=0.09899494936611666 2023-09-29 03:50:05,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:50:06,144 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.01 vs. limit=15.0 2023-09-29 03:50:10,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=239613.33333333334, ans=0.125 2023-09-29 03:50:12,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 03:50:13,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:15,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:16,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=239613.33333333334, ans=0.1 2023-09-29 03:50:18,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:50:19,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:50:19,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:23,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:50:27,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 03:50:27,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:50:28,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:30,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 03:50:34,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:41,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 03:50:43,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:43,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 03:50:44,862 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.977e+02 2.189e+02 2.469e+02 3.390e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 03:50:47,468 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.67 vs. limit=15.0 2023-09-29 03:50:47,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 03:50:47,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 03:50:47,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:50:51,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:50:52,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:50:52,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:50:56,050 INFO [train.py:1039] (0/4) Epoch 7, batch 4100, loss[loss=0.2327, simple_loss=0.3026, pruned_loss=0.08142, over 23950.00 frames. ], tot_loss[loss=0.2277, simple_loss=0.2918, pruned_loss=0.08175, over 4701167.08 frames. ], batch size: 86, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:51:01,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 03:51:02,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. 
Duration: 0.9 2023-09-29 03:51:04,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 03:51:07,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 03:51:07,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:09,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:51:10,586 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 03:51:12,839 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=239880.0, ans=0.125 2023-09-29 03:51:14,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:15,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:51:15,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:17,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 03:51:18,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:51:20,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:21,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:51:21,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 03:51:22,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=239880.0, ans=0.125 2023-09-29 03:51:23,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:23,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:51:23,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:23,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:51:23,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 03:51:26,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:28,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. 
Duration: 0.56 2023-09-29 03:51:30,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:51:31,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:31,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 03:51:31,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=239946.66666666666, ans=0.125 2023-09-29 03:51:33,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:51:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:51:33,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:51:37,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 03:51:38,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:51:40,510 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-36000.pt 2023-09-29 03:51:45,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 03:51:45,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=239946.66666666666, ans=0.0 2023-09-29 03:51:47,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 03:51:48,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:50,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:51:53,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:56,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:01,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:02,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:52:09,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:09,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:52:13,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:16,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 03:52:21,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:52:21,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:52:22,666 INFO [train.py:1039] (0/4) Epoch 7, batch 4150, loss[loss=0.2222, simple_loss=0.2928, pruned_loss=0.07573, over 23392.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.291, pruned_loss=0.08095, over 4709449.47 frames. ], batch size: 93, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:52:22,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:52:22,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 03:52:26,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:27,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 03:52:29,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 03:52:29,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. 
Duration: 0.89 2023-09-29 03:52:31,257 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=240146.66666666666, ans=0.0 2023-09-29 03:52:32,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:33,220 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.45 vs. limit=15.0 2023-09-29 03:52:38,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:52:38,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:41,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=240213.33333333334, ans=0.2 2023-09-29 03:52:42,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:52:45,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:52:45,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:52:45,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=240213.33333333334, ans=0.0 2023-09-29 03:52:46,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 03:52:48,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:49,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:52:54,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:56,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:52:58,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 03:53:01,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 03:53:01,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:53:03,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 03:53:03,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:53:03,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:06,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:06,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:53:10,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 03:53:13,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:14,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=240346.66666666666, ans=0.125 2023-09-29 03:53:16,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:53:18,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 03:53:18,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:53:19,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 03:53:21,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:53:22,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:24,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:24,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 03:53:24,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 03:53:24,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:53:25,342 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.46 vs. limit=22.5 2023-09-29 03:53:26,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:53:29,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 03:53:29,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:29,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:53:29,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:53:31,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 03:53:32,547 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.142e+02 2.391e+02 2.660e+02 4.088e+02, threshold=4.782e+02, percent-clipped=0.0 2023-09-29 03:53:32,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:32,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 03:53:32,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:53:34,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:35,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 03:53:36,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:42,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:53:44,024 INFO [train.py:1039] (0/4) Epoch 7, batch 4200, loss[loss=0.2066, simple_loss=0.2509, pruned_loss=0.08113, over 23468.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2889, pruned_loss=0.08032, over 4708657.19 frames. ], batch size: 285, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:53:44,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 03:53:45,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:53:48,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:53:51,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:53:51,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 03:53:51,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:54,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 03:53:57,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 03:53:57,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:59,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:01,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:54:05,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:54:09,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:09,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:10,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 03:54:10,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:11,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 03:54:11,256 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=240546.66666666666, ans=10.0 2023-09-29 03:54:12,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:54:12,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:54:15,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:54:18,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 03:54:18,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:21,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:54:23,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:54:27,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:54:30,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:54:31,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 03:54:32,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 03:54:33,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:33,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:54:38,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:54:39,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:42,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=240680.0, ans=0.025 2023-09-29 03:54:45,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:54:45,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=240680.0, ans=0.125 2023-09-29 03:54:49,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 03:54:52,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:56,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:54:56,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:54:58,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 03:55:05,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:55:05,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=240813.33333333334, ans=0.125 2023-09-29 03:55:06,375 INFO [train.py:1039] (0/4) Epoch 7, batch 4250, loss[loss=0.2086, simple_loss=0.2391, pruned_loss=0.089, over 19292.00 frames. ], tot_loss[loss=0.223, simple_loss=0.2873, pruned_loss=0.0793, over 4703149.33 frames. ], batch size: 389, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:55:09,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:55:09,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:55:12,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:17,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:55:19,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 03:55:19,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:55:21,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 03:55:24,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:24,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=240880.0, ans=0.125 2023-09-29 03:55:29,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:31,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:33,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:55:33,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:55:33,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=240880.0, ans=0.125 2023-09-29 03:55:34,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:34,943 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=240880.0, ans=0.1 2023-09-29 03:55:36,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:37,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 03:55:40,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:55:42,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:55:44,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 03:55:44,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=240946.66666666666, ans=0.0 2023-09-29 03:55:47,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 03:55:47,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:49,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:49,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:55:51,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:52,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:55:53,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=240946.66666666666, ans=0.2 2023-09-29 03:55:55,357 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-09-29 03:55:56,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:55:57,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:55:58,347 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.49 vs. limit=15.0 2023-09-29 03:56:00,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:02,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:03,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 03:56:03,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:56:06,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 03:56:07,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 03:56:09,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:56:12,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:12,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:56:13,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 03:56:14,676 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.57 vs. limit=12.0 2023-09-29 03:56:15,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:56:16,640 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.803e+02 2.211e+02 2.441e+02 2.743e+02 4.963e+02, threshold=4.882e+02, percent-clipped=1.0 2023-09-29 03:56:16,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:56:17,036 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=241080.0, ans=0.05 2023-09-29 03:56:19,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:23,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 03:56:25,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:56:27,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:28,518 INFO [train.py:1039] (0/4) Epoch 7, batch 4300, loss[loss=0.2211, simple_loss=0.2802, pruned_loss=0.081, over 23548.00 frames. ], tot_loss[loss=0.2228, simple_loss=0.2876, pruned_loss=0.07895, over 4717158.85 frames. ], batch size: 149, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:56:28,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:30,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:56:30,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:56:30,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 03:56:31,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:37,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:37,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 03:56:38,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=241146.66666666666, ans=0.0 2023-09-29 03:56:43,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:44,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=241213.33333333334, ans=0.125 2023-09-29 03:56:47,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=241213.33333333334, ans=0.1 2023-09-29 03:56:47,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=241213.33333333334, ans=0.1 2023-09-29 03:56:50,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:50,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 03:56:51,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:56:54,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:56:54,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:56:54,664 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 03:57:01,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 03:57:01,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:04,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 03:57:04,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:57:05,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 03:57:07,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:57:09,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:57:11,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:57:11,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:57:13,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:57:14,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:16,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:57:16,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. 
Duration: 0.82 2023-09-29 03:57:16,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 03:57:20,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:57:23,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:23,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:57:23,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:23,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=241346.66666666666, ans=0.09899494936611666 2023-09-29 03:57:24,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:24,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 03:57:24,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 03:57:25,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 03:57:26,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:57:27,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. 
Duration: 0.75 2023-09-29 03:57:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 03:57:29,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:32,847 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 03:57:32,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:57:35,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=241413.33333333334, ans=0.125 2023-09-29 03:57:36,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:36,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:39,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 03:57:41,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:41,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:41,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:57:41,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 03:57:41,991 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.21 vs. limit=6.0 2023-09-29 03:57:42,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:57:42,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:57:44,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:46,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:46,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:47,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=241413.33333333334, ans=0.125 2023-09-29 03:57:51,828 INFO [train.py:1039] (0/4) Epoch 7, batch 4350, loss[loss=0.2117, simple_loss=0.2889, pruned_loss=0.06729, over 24684.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.2884, pruned_loss=0.07968, over 4703081.43 frames. ], batch size: 65, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:57:53,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 03:57:53,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:57:58,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 03:58:01,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:04,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:58:04,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:58:05,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=241480.0, ans=0.125 2023-09-29 03:58:10,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:58:13,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:15,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:58:16,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:58:16,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=241546.66666666666, ans=0.0 2023-09-29 03:58:20,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:58:23,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:58:24,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 03:58:29,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 03:58:31,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:32,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:37,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:39,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 03:58:44,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:58:46,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:58:50,882 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 03:58:52,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:52,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:58:53,858 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 03:58:56,003 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-09-29 03:58:56,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:56,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:57,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:58:57,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:59,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:59,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:02,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 03:59:02,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:02,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:02,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 03:59:04,159 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.126e+02 2.357e+02 2.632e+02 4.633e+02, threshold=4.715e+02, percent-clipped=0.0 2023-09-29 03:59:04,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 03:59:05,838 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 03:59:05,846 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 03:59:05,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 03:59:08,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:59:09,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:59:09,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:10,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:59:12,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 03:59:13,470 INFO [train.py:1039] (0/4) Epoch 7, batch 4400, loss[loss=0.232, simple_loss=0.2928, pruned_loss=0.08565, over 19907.00 frames. ], tot_loss[loss=0.2248, simple_loss=0.2898, pruned_loss=0.07987, over 4714142.11 frames. ], batch size: 43, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:59:15,072 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. 
Duration: 0.768 2023-09-29 03:59:15,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:15,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=241813.33333333334, ans=0.1 2023-09-29 03:59:19,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:21,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:22,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:24,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 03:59:24,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 03:59:25,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 03:59:25,820 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 03:59:25,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:59:25,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:29,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. 
Duration: 0.87 2023-09-29 03:59:31,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:32,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:32,585 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 03:59:37,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:37,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 03:59:39,149 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 03:59:42,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 03:59:43,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 03:59:43,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 03:59:43,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:45,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:45,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 03:59:46,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:48,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 03:59:48,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 03:59:50,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:51,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:59:51,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:54,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:54,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:54,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 03:59:55,854 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 03:59:58,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:05,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:00:05,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=242013.33333333334, ans=0.125 2023-09-29 04:00:08,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 04:00:15,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:00:16,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:18,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:00:19,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 04:00:19,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:00:19,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:00:19,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:00:21,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:00:25,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 04:00:30,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-09-29 04:00:31,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 04:00:31,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:00:31,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 04:00:34,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:00:36,376 INFO [train.py:1039] (0/4) Epoch 7, batch 4450, loss[loss=0.2478, simple_loss=0.3173, pruned_loss=0.08914, over 24427.00 frames. ], tot_loss[loss=0.2269, simple_loss=0.2913, pruned_loss=0.08122, over 4710279.61 frames. ], batch size: 77, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:00:36,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:00:38,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 04:00:43,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:44,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:44,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 04:00:51,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=242213.33333333334, ans=0.0 2023-09-29 04:00:52,263 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=15.0 2023-09-29 04:00:52,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:00:52,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:00:54,052 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-09-29 04:00:54,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=242213.33333333334, ans=0.125 2023-09-29 04:00:56,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:56,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:01:01,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:01:01,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:02,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. 
Duration: 0.9 2023-09-29 04:01:02,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:02,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:02,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:02,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:01:07,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:01:12,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:12,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:14,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:14,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:16,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:01:21,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:01:22,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. 
Duration: 0.85 2023-09-29 04:01:22,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 04:01:22,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:01:25,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:27,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 04:01:30,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:01:34,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:35,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.98 vs. limit=15.0 2023-09-29 04:01:35,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 04:01:35,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:35,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:35,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:01:35,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:01:39,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:41,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:01:42,113 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.26 vs. limit=15.0 2023-09-29 04:01:42,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 04:01:44,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:01:45,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:47,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:49,225 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.081e+02 2.382e+02 2.836e+02 4.315e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 04:01:49,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:50,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:01:52,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 04:01:54,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=242413.33333333334, ans=0.125 2023-09-29 04:01:55,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 04:01:57,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:01:59,209 INFO [train.py:1039] (0/4) Epoch 7, batch 4500, loss[loss=0.199, simple_loss=0.2566, pruned_loss=0.07075, over 23558.00 frames. ], tot_loss[loss=0.2276, simple_loss=0.2914, pruned_loss=0.08194, over 4697368.41 frames. ], batch size: 134, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:01:59,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=242480.0, ans=0.1 2023-09-29 04:02:04,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:05,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 04:02:05,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 04:02:08,014 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.54 vs. limit=15.0 2023-09-29 04:02:08,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:12,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 04:02:12,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:14,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:02:15,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:02:15,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:15,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:16,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=242546.66666666666, ans=0.125 2023-09-29 04:02:27,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:29,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:02:31,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:02:31,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=242613.33333333334, ans=0.125 2023-09-29 04:02:31,441 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=242613.33333333334, ans=0.1 2023-09-29 04:02:32,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:02:34,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:02:37,465 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=242613.33333333334, ans=0.2 2023-09-29 04:02:40,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:02:44,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:02:49,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:02:50,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:02:52,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 04:02:52,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 04:02:54,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:56,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:57,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:57,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=242680.0, ans=0.125 2023-09-29 04:02:59,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:59,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 04:02:59,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:02:59,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:02:59,425 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=242680.0, ans=0.125 2023-09-29 04:03:04,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:03:04,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 04:03:04,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=242746.66666666666, ans=0.2 2023-09-29 04:03:07,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:10,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:03:10,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:03:13,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 04:03:13,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 04:03:13,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 04:03:19,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 04:03:22,555 INFO [train.py:1039] (0/4) Epoch 7, batch 4550, loss[loss=0.1977, simple_loss=0.2724, pruned_loss=0.06152, over 24478.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.29, pruned_loss=0.08121, over 4693699.53 frames. ], batch size: 66, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:03:22,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 04:03:22,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 04:03:24,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=242813.33333333334, ans=0.0 2023-09-29 04:03:27,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:28,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=242813.33333333334, ans=0.0 2023-09-29 04:03:29,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:31,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:36,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:03:37,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:03:39,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:03:40,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:03:40,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:42,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:03:44,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:45,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:03:48,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 04:03:49,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 04:03:51,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:03:52,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 04:03:52,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=242880.0, ans=0.2 2023-09-29 04:03:57,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 04:03:57,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:03:59,862 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.68 vs. limit=15.0 2023-09-29 04:04:02,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 04:04:04,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 04:04:05,007 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=11.74 vs. limit=15.0 2023-09-29 04:04:07,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:07,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:07,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:04:09,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 04:04:12,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:15,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:15,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:17,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:17,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 04:04:17,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-09-29 04:04:18,277 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.45 vs. limit=15.0 2023-09-29 04:04:18,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:04:19,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 04:04:20,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 04:04:20,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:23,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:23,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:25,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:25,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:04:27,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:04:29,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 04:04:31,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 04:04:31,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:04:31,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 04:04:31,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:04:31,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 04:04:34,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:04:34,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:04:35,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=243080.0, ans=0.125 2023-09-29 04:04:36,258 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.221e+02 2.599e+02 3.023e+02 5.403e+02, threshold=5.198e+02, percent-clipped=1.0 2023-09-29 04:04:37,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:04:40,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:40,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 04:04:40,669 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.88 vs. limit=22.5 2023-09-29 04:04:41,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:04:43,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:04:46,064 INFO [train.py:1039] (0/4) Epoch 7, batch 4600, loss[loss=0.1841, simple_loss=0.2547, pruned_loss=0.05671, over 24306.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2884, pruned_loss=0.08043, over 4703047.68 frames. ], batch size: 56, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:04:46,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:47,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:50,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:04:50,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:04:52,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:04:53,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 04:04:55,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 04:04:56,373 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.66 vs. limit=15.0 2023-09-29 04:04:59,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:05:02,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:03,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:10,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 04:05:12,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:15,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:16,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=243213.33333333334, ans=10.0 2023-09-29 04:05:17,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:05:17,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 04:05:20,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=243280.0, ans=0.0 2023-09-29 04:05:23,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 04:05:23,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:05:25,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:05:27,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=243280.0, ans=0.125 2023-09-29 04:05:32,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:32,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:05:34,676 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.27 vs. limit=15.0 2023-09-29 04:05:35,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:05:39,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-09-29 04:05:39,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=243346.66666666666, ans=0.1 2023-09-29 04:05:41,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:05:45,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:46,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:05:48,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:48,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 04:05:50,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:50,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 04:05:50,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:50,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:52,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 04:05:53,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:55,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:56,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 04:05:56,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 04:05:56,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 04:05:56,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:05:58,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:05:59,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:01,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:06:07,198 INFO [train.py:1039] (0/4) Epoch 7, batch 4650, loss[loss=0.1971, simple_loss=0.2627, pruned_loss=0.06572, over 20188.00 frames. ], tot_loss[loss=0.2242, simple_loss=0.288, pruned_loss=0.08014, over 4704549.45 frames. ], batch size: 44, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:06:12,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 04:06:16,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:16,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:18,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:06:18,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:18,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:06:19,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:22,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 04:06:23,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.71 vs. limit=15.0 2023-09-29 04:06:26,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:06:29,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 04:06:29,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:29,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. 
Duration: 0.94 2023-09-29 04:06:31,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:06:31,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 04:06:32,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 04:06:32,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:32,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:06:34,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:06:37,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:37,497 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 04:06:41,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:43,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 04:06:46,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:46,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 04:06:46,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 04:06:48,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:06:51,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:06:52,145 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:06:52,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=243613.33333333334, ans=0.1 2023-09-29 04:06:56,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:00,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:01,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=243680.0, ans=0.0 2023-09-29 04:07:04,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:04,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:06,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:07:06,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. 
Duration: 0.62 2023-09-29 04:07:07,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 04:07:07,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 04:07:07,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 04:07:09,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=243680.0, ans=0.0 2023-09-29 04:07:10,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:15,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=243746.66666666666, ans=0.035 2023-09-29 04:07:18,486 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.995e+02 2.331e+02 2.666e+02 3.727e+02, threshold=4.663e+02, percent-clipped=0.0 2023-09-29 04:07:18,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:07:18,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:18,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 04:07:18,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:20,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:07:20,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:07:22,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:07:24,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:07:24,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:26,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:27,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:29,921 INFO [train.py:1039] (0/4) Epoch 7, batch 4700, loss[loss=0.2519, simple_loss=0.3216, pruned_loss=0.09113, over 24578.00 frames. ], tot_loss[loss=0.2235, simple_loss=0.288, pruned_loss=0.07953, over 4712938.98 frames. ], batch size: 71, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:07:30,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:07:30,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:07:31,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:07:32,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 04:07:33,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 04:07:40,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:42,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:44,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:07:45,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:47,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:07:50,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 04:07:51,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 04:07:53,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:54,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=243880.0, ans=0.1 2023-09-29 04:07:55,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 04:07:56,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=243880.0, ans=0.125 2023-09-29 04:07:57,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:08:01,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:06,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:08:09,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:08:11,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:08:12,956 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:08:17,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 04:08:19,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:08:22,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:25,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 04:08:26,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:08:27,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=244013.33333333334, ans=0.0 2023-09-29 04:08:32,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:08:32,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 04:08:32,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=244013.33333333334, ans=0.2 2023-09-29 04:08:33,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:33,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:36,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:38,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:08:38,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 04:08:38,517 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 04:08:41,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:42,737 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.72 vs. 
limit=10.0 2023-09-29 04:08:43,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 04:08:46,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:49,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 04:08:50,858 INFO [train.py:1039] (0/4) Epoch 7, batch 4750, loss[loss=0.211, simple_loss=0.2737, pruned_loss=0.07413, over 23376.00 frames. ], tot_loss[loss=0.2251, simple_loss=0.2892, pruned_loss=0.08044, over 4704187.71 frames. ], batch size: 134, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:08:52,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:08:52,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:58,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:58,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 04:08:59,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=244146.66666666666, ans=0.025 2023-09-29 04:09:01,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 04:09:01,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:02,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=244146.66666666666, ans=0.2 2023-09-29 04:09:05,529 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.91 vs. limit=15.0 2023-09-29 04:09:06,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 04:09:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:09:09,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:09:09,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:14,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-09-29 04:09:15,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=244213.33333333334, ans=0.025 2023-09-29 04:09:19,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:09:20,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 04:09:21,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:25,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:26,990 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 04:09:26,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 04:09:34,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 04:09:37,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:40,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 04:09:41,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=244346.66666666666, ans=0.125 2023-09-29 04:09:42,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:09:42,835 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 04:09:42,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:09:46,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:09:46,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=244346.66666666666, ans=0.1 2023-09-29 04:09:48,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:09:51,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 04:09:51,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 04:09:51,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:51,748 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.87 vs. limit=10.0 2023-09-29 04:09:53,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 04:09:53,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:53,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=244346.66666666666, ans=0.0 2023-09-29 04:09:54,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:09:54,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 04:09:57,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 04:10:00,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:02,306 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.924e+02 2.114e+02 2.511e+02 3.995e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 04:10:02,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:10:02,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 04:10:02,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:04,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 04:10:05,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=244413.33333333334, ans=0.125 2023-09-29 04:10:06,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:10:06,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:08,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:10:10,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:11,392 INFO [train.py:1039] (0/4) Epoch 7, batch 4800, loss[loss=0.2018, simple_loss=0.2813, pruned_loss=0.06116, over 24437.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.2904, pruned_loss=0.08099, over 4712497.35 frames. ], batch size: 63, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:10:11,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 04:10:11,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 04:10:13,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 04:10:17,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:10:19,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:10:20,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 04:10:20,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=244480.0, ans=0.0 2023-09-29 04:10:26,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:27,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:32,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:10:32,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:32,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:33,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 04:10:35,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:35,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:10:35,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 04:10:37,329 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.911e-03 2023-09-29 04:10:41,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:10:42,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:42,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:10:43,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=244613.33333333334, ans=0.0 2023-09-29 04:10:44,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:44,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:10:44,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:44,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:48,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:52,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:10:53,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:10:55,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:10:57,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:58,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 04:10:58,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 04:11:00,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:00,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:11:00,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:11:00,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:00,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:11:02,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:11:02,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:11:07,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:09,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:11,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:15,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 04:11:17,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:17,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:17,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:11:17,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:23,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:23,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:11:23,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:25,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 04:11:25,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:11:26,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:11:30,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:30,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:30,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:31,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 04:11:34,719 INFO [train.py:1039] (0/4) Epoch 7, batch 4850, loss[loss=0.2176, simple_loss=0.2894, pruned_loss=0.07292, over 24632.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.291, pruned_loss=0.08117, over 4715270.81 frames. ], batch size: 65, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:11:36,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 04:11:36,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:36,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:37,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 04:11:37,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:40,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:47,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=244813.33333333334, ans=0.125 2023-09-29 04:11:48,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 04:11:49,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:54,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:11:54,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:11:56,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:00,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:12:00,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:12:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:12:03,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. 
Duration: 0.98 2023-09-29 04:12:07,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:12:09,458 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.09 vs. limit=15.0 2023-09-29 04:12:10,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:12:10,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:12:10,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:12:10,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 04:12:14,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:12:14,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:18,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 04:12:19,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 04:12:20,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 04:12:25,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:12:27,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 04:12:29,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:12:29,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:12:31,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:12:32,210 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.01 vs. limit=15.0 2023-09-29 04:12:33,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 04:12:33,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:35,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 04:12:35,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:38,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:12:38,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. 
Duration: 0.73 2023-09-29 04:12:39,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=245080.0, ans=0.125 2023-09-29 04:12:46,411 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.410e+02 2.869e+02 4.952e+02, threshold=4.821e+02, percent-clipped=3.0 2023-09-29 04:12:46,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:52,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:12:52,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:12:54,116 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-09-29 04:12:55,906 INFO [train.py:1039] (0/4) Epoch 7, batch 4900, loss[loss=0.2447, simple_loss=0.3028, pruned_loss=0.09333, over 23216.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2902, pruned_loss=0.08052, over 4715990.52 frames. ], batch size: 105, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:12:57,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 04:12:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:13:03,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:05,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:13:05,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:13:09,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 04:13:09,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=245146.66666666666, ans=0.125 2023-09-29 04:13:09,923 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=17.33 vs. limit=15.0 2023-09-29 04:13:11,915 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=245213.33333333334, ans=0.125 2023-09-29 04:13:15,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 04:13:18,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.97 vs. limit=15.0 2023-09-29 04:13:19,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 04:13:20,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 04:13:20,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:22,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:13:22,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:13:22,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:22,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:13:22,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 04:13:25,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 04:13:25,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:13:28,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:13:28,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:28,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=245280.0, ans=0.125 2023-09-29 04:13:31,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:13:31,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:33,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 04:13:33,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 04:13:33,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:13:35,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:35,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 04:13:35,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 04:13:37,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=245280.0, ans=0.2 2023-09-29 04:13:41,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 04:13:42,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:13:44,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:13:44,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:13:46,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:46,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-29 04:13:46,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:13:47,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 04:13:50,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:52,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:13:53,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:13:57,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 04:13:58,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:13:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:13:58,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 04:14:00,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=245413.33333333334, ans=0.125 2023-09-29 04:14:02,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=245413.33333333334, ans=0.125 2023-09-29 04:14:03,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:14:05,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:06,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 04:14:06,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:06,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:14:07,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=245413.33333333334, ans=0.0 2023-09-29 04:14:08,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=245413.33333333334, ans=0.1 2023-09-29 04:14:10,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:14,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:15,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:14:15,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:15,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 04:14:16,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 04:14:18,192 INFO [train.py:1039] (0/4) Epoch 7, batch 4950, loss[loss=0.2045, simple_loss=0.266, pruned_loss=0.07145, over 24304.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2886, pruned_loss=0.0798, over 4726613.19 frames. ], batch size: 56, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:14:19,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:20,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:23,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 04:14:23,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=245480.0, ans=0.125 2023-09-29 04:14:24,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 04:14:24,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:14:25,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 04:14:25,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:26,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:14:26,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 04:14:26,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:28,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:29,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:14:31,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:14:33,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:34,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:34,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:38,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:14:47,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:48,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:50,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 04:14:50,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:52,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:14:53,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 04:14:53,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 04:14:57,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:58,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:15:00,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:15:01,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:01,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:15:03,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:15:04,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:07,328 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.58 vs. 
limit=15.0 2023-09-29 04:15:07,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:15:08,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:15:09,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:09,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:11,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 04:15:11,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:15:15,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:15:18,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:15:21,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:15:21,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:15:21,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:15:21,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=245680.0, ans=0.2 2023-09-29 04:15:23,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:15:23,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:15:25,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:15:26,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:15:26,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:28,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 04:15:31,891 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.077e+02 2.324e+02 2.627e+02 6.143e+02, threshold=4.647e+02, percent-clipped=3.0 2023-09-29 04:15:32,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 04:15:39,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:15:40,888 INFO [train.py:1039] (0/4) Epoch 7, batch 5000, loss[loss=0.199, simple_loss=0.2761, pruned_loss=0.0609, over 24307.00 frames. 
], tot_loss[loss=0.2235, simple_loss=0.2882, pruned_loss=0.07939, over 4720856.44 frames. ], batch size: 61, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:15:47,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:47,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:15:49,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 04:15:49,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 04:15:53,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:15:56,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 04:15:56,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:56,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:15:56,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 04:15:56,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:58,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 04:15:59,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 04:15:59,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:59,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:01,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 04:16:01,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 04:16:03,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:16:03,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 04:16:03,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:16:04,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:04,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:16:04,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 04:16:04,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. 
Duration: 0.6 2023-09-29 04:16:07,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 04:16:07,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:08,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:08,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=245880.0, ans=0.125 2023-09-29 04:16:10,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 04:16:11,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:16:11,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:12,410 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.27 vs. limit=10.0 2023-09-29 04:16:13,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:16:16,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:16:19,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 04:16:19,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:16:21,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:16:24,578 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 04:16:28,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:16:28,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:28,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:16:33,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 04:16:34,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:34,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:34,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:16:36,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 04:16:37,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:40,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 04:16:42,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:16:50,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 04:16:53,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:02,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:03,839 INFO [train.py:1039] (0/4) Epoch 7, batch 5050, loss[loss=0.2361, simple_loss=0.2935, pruned_loss=0.08933, over 22770.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2892, pruned_loss=0.0795, over 4721208.25 frames. ], batch size: 322, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:17:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:05,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:17:05,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:05,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:17:06,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:17:08,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 04:17:13,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:13,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 04:17:14,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:17:16,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:18,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:17:18,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 04:17:20,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:20,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:17:23,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:17:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:17:24,197 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.98 vs. 
limit=10.0 2023-09-29 04:17:24,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:17:28,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=246213.33333333334, ans=0.125 2023-09-29 04:17:29,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=246213.33333333334, ans=0.1 2023-09-29 04:17:31,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 04:17:31,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:17:33,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:33,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 04:17:33,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:17:35,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:37,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:37,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:17:37,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. 
Duration: 0.94 2023-09-29 04:17:38,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 04:17:40,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:42,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:17:45,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:47,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 04:17:48,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:17:50,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 04:17:52,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:17:52,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:17:52,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:53,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:55,852 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.36 vs. 
limit=15.0 2023-09-29 04:17:56,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:17:58,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:17:59,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:59,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:59,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:17:59,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 04:17:59,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:18:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:18:06,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:18:06,587 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 04:18:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:18:08,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 04:18:10,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:10,346 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 04:18:10,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=246413.33333333334, ans=0.125 2023-09-29 04:18:12,323 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:18:13,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:13,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 04:18:13,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:18,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 04:18:19,881 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 2.259e+02 2.586e+02 3.154e+02 5.284e+02, threshold=5.172e+02, percent-clipped=3.0 2023-09-29 04:18:20,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. 
Duration: 0.54 2023-09-29 04:18:24,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:24,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:18:24,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:18:26,483 INFO [train.py:1039] (0/4) Epoch 7, batch 5100, loss[loss=0.2304, simple_loss=0.2853, pruned_loss=0.08772, over 23771.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2902, pruned_loss=0.08033, over 4724428.47 frames. ], batch size: 212, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:18:28,061 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 04:18:29,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=246480.0, ans=0.0 2023-09-29 04:18:31,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:35,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 04:18:35,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 04:18:35,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:37,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 04:18:42,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:42,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 04:18:42,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 04:18:46,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=246546.66666666666, ans=0.0 2023-09-29 04:18:47,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:47,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:18:53,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:54,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=246546.66666666666, ans=0.025 2023-09-29 04:18:57,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 04:18:57,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:00,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:19:00,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. 
Duration: 0.588875 2023-09-29 04:19:02,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 04:19:07,142 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 04:19:07,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:07,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 04:19:07,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 04:19:08,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=246613.33333333334, ans=0.0 2023-09-29 04:19:11,340 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.51 vs. limit=6.0 2023-09-29 04:19:12,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:19:19,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=246680.0, ans=0.125 2023-09-29 04:19:19,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=246680.0, ans=0.125 2023-09-29 04:19:22,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:23,564 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.35 vs. limit=15.0 2023-09-29 04:19:25,117 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.56 vs. limit=15.0 2023-09-29 04:19:25,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 04:19:27,566 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 04:19:27,578 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 04:19:29,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 04:19:29,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:32,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 04:19:35,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-09-29 04:19:37,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:19:38,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:19:40,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 04:19:43,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:19:45,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 04:19:48,703 INFO [train.py:1039] (0/4) Epoch 7, batch 5150, loss[loss=0.2132, simple_loss=0.2893, pruned_loss=0.06857, over 24633.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2898, pruned_loss=0.07964, over 4722948.19 frames. ], batch size: 65, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:19:50,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:19:50,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:19:50,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:19:51,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:19:52,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 04:19:53,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:19:54,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 04:19:54,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 04:19:55,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 04:19:55,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:19:55,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 04:19:58,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:58,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:19:58,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:01,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:02,703 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.16 vs. limit=15.0 2023-09-29 04:20:07,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 04:20:07,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 04:20:08,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:09,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:20:11,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:20:11,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:11,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:12,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:20:12,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:20:12,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 04:20:14,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:20:14,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:16,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 04:20:18,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 04:20:19,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:20:24,999 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=246946.66666666666, ans=0.0 2023-09-29 04:20:26,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:20:29,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 04:20:32,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:39,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:40,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:44,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:44,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:47,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 04:20:49,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 04:20:50,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:20:50,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:55,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:55,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:55,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 04:21:00,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:03,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:21:05,021 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.008e+02 2.205e+02 2.538e+02 3.618e+02, threshold=4.410e+02, percent-clipped=0.0 2023-09-29 04:21:05,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:21:05,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:21:06,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:21:06,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 04:21:06,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:21:06,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:21:10,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:21:10,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=247146.66666666666, ans=0.125 2023-09-29 04:21:11,425 INFO [train.py:1039] (0/4) Epoch 7, batch 5200, loss[loss=0.2221, simple_loss=0.2691, pruned_loss=0.08755, over 22714.00 frames. ], tot_loss[loss=0.2274, simple_loss=0.292, pruned_loss=0.08142, over 4685437.08 frames. ], batch size: 322, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:21:12,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:21:14,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:14,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=247146.66666666666, ans=0.125 2023-09-29 04:21:19,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 04:21:21,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:21:22,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:21:25,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:27,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:21:27,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:27,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=247213.33333333334, ans=0.07 2023-09-29 04:21:30,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 04:21:32,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=247213.33333333334, ans=0.1 2023-09-29 04:21:33,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:21:35,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:36,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 04:21:38,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:21:39,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=247213.33333333334, ans=0.125 2023-09-29 04:21:41,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 04:21:42,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 04:21:42,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 04:21:45,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 04:21:47,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:47,072 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 04:21:47,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:48,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:49,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:21:50,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 04:21:50,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:21:53,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:57,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. 
Duration: 0.8 2023-09-29 04:21:57,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 04:21:57,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 04:22:01,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 04:22:03,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:22:09,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:22:09,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:11,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 04:22:11,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:22:13,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:22:13,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:13,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 04:22:18,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:22:21,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:22:22,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:22,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:27,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:28,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 04:22:30,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:30,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:22:31,763 INFO [train.py:1039] (0/4) Epoch 7, batch 5250, loss[loss=0.1961, simple_loss=0.2658, pruned_loss=0.06318, over 24628.00 frames. ], tot_loss[loss=0.2249, simple_loss=0.2899, pruned_loss=0.07996, over 4694658.60 frames. ], batch size: 60, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:22:31,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:31,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 04:22:34,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:22:34,870 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.whiten.whitening_limit, batch_count=247480.0, ans=12.0 2023-09-29 04:22:35,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:22:40,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:40,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:22:41,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:22:48,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:49,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=247546.66666666666, ans=0.125 2023-09-29 04:22:50,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:22:52,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:22:53,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 04:22:55,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 04:22:55,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:57,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:23:09,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=247613.33333333334, ans=0.0 2023-09-29 04:23:41,160 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.140e+02 2.318e+02 2.697e+02 3.802e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 04:23:47,119 INFO [train.py:1039] (0/4) Epoch 7, batch 5300, loss[loss=0.2435, simple_loss=0.2937, pruned_loss=0.09663, over 23964.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2889, pruned_loss=0.07966, over 4698183.19 frames. ], batch size: 196, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:23:53,642 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-09-29 04:23:57,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=247813.33333333334, ans=0.125 2023-09-29 04:24:01,810 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-7.pt 2023-09-29 04:24:07,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:24:07,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-09-29 04:24:07,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 04:24:07,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:07,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:07,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:07,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:08,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:08,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:08,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:08,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:24:09,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:24:09,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 04:24:09,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-09-29 04:24:09,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 04:24:09,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:24:09,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 04:24:09,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 04:24:09,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:10,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:10,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:10,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:10,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:24:11,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:11,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:11,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 04:24:11,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:11,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:11,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:24:11,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:11,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:24:12,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 04:24:12,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:13,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:13,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 04:24:13,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 04:24:13,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:24:13,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:24:13,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 04:24:13,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 04:24:13,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:14,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:24:14,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:15,042 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 04:24:15,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 04:24:15,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:24:15,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:15,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 04:24:15,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 04:24:15,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-09-29 04:24:15,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:19,230 INFO [train.py:1039] (0/4) Epoch 8, batch 0, loss[loss=0.2309, simple_loss=0.2919, pruned_loss=0.08491, over 23448.00 frames. ], tot_loss[loss=0.2309, simple_loss=0.2919, pruned_loss=0.08491, over 23448.00 frames. ], batch size: 134, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:24:19,231 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 04:24:33,513 INFO [train.py:1071] (0/4) Epoch 8, validation: loss=0.2869, simple_loss=0.2985, pruned_loss=0.1377, over 1125622.00 frames. 2023-09-29 04:24:33,514 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 04:24:33,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 04:24:35,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:24:36,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:24:39,288 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.50 vs. limit=22.5 2023-09-29 04:24:41,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:41,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:24:41,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 04:24:42,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 04:24:45,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 04:24:47,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:49,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:55,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:24:55,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:24:57,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 04:25:01,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:25:10,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:25:10,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:25:13,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 04:25:17,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:25:17,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:25:19,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:24,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:25:30,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:35,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 04:25:39,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 04:25:39,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:25:39,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:40,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:25:41,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:25:46,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 04:25:47,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:49,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:52,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:25:54,279 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=248226.66666666666, ans=0.2 2023-09-29 04:25:55,305 INFO [train.py:1039] (0/4) Epoch 8, batch 50, loss[loss=0.2357, simple_loss=0.3132, pruned_loss=0.07909, over 24313.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.2926, pruned_loss=0.08028, over 1061091.32 frames. ], batch size: 74, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:25:55,404 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 04:25:55,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:26:00,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:01,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:01,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-09-29 04:26:03,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:26:03,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:26:05,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=248226.66666666666, ans=0.0 2023-09-29 04:26:06,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:07,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:09,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:12,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 04:26:14,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:23,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:26:24,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 04:26:26,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 04:26:27,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 04:26:29,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:26:29,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:31,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:26:31,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:26:32,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:26:32,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:36,987 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.05 vs. limit=22.5 2023-09-29 04:26:40,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:42,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:26:42,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:26:42,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 04:26:45,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 04:26:46,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:26:46,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 04:26:48,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:48,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 04:26:50,452 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 2.177e+02 2.443e+02 2.821e+02 4.431e+02, threshold=4.886e+02, percent-clipped=0.0 2023-09-29 04:26:52,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=248426.66666666666, ans=0.125 2023-09-29 04:26:56,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:26:57,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:58,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:58,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.whiten.whitening_limit, batch_count=248426.66666666666, ans=12.0 2023-09-29 04:27:00,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 04:27:00,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:03,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 04:27:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 04:27:05,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:05,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:07,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:27:07,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:27:07,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 04:27:08,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 04:27:10,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:27:12,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:12,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 04:27:13,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 04:27:13,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 04:27:13,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:15,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:16,871 INFO [train.py:1039] (0/4) Epoch 8, batch 100, loss[loss=0.1833, simple_loss=0.2567, pruned_loss=0.05495, over 24523.00 frames. ], tot_loss[loss=0.2276, simple_loss=0.2928, pruned_loss=0.08121, over 1869029.10 frames. ], batch size: 63, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:27:16,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:27:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:27:18,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:27:23,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:27:26,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:30,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. 
Duration: 0.83 2023-09-29 04:27:30,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:34,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:27:34,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:34,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:34,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:34,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:37,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 04:27:38,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:27:40,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:40,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:40,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:44,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. 
Duration: 0.88 2023-09-29 04:27:45,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:46,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:48,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:27:49,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:27:52,896 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 04:27:52,926 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 04:27:54,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:27:54,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:27:57,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=248693.33333333334, ans=0.0 2023-09-29 04:27:59,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:28:01,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:01,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:28:07,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:08,995 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 04:28:10,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:28:13,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:15,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:28:18,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:20,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:23,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:24,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:28:27,003 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.45 vs. limit=15.0 2023-09-29 04:28:29,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:29,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:28:31,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:31,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:28:31,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:31,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=248826.66666666666, ans=0.125 2023-09-29 04:28:32,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 04:28:33,013 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 04:28:33,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:33,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:28:34,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:34,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:34,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:28:34,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 04:28:34,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:28:34,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:36,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:37,582 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.33 vs. limit=22.5 2023-09-29 04:28:38,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:40,404 INFO [train.py:1039] (0/4) Epoch 8, batch 150, loss[loss=0.2226, simple_loss=0.2993, pruned_loss=0.07294, over 23991.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2916, pruned_loss=0.08063, over 2499576.20 frames. ], batch size: 80, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:28:40,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:28:40,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:28:40,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=248893.33333333334, ans=0.125 2023-09-29 04:28:43,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:46,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 04:28:46,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:28:46,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:48,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:50,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:51,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:51,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:55,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 04:28:56,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 04:28:56,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 04:28:57,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=248960.0, ans=0.125 2023-09-29 04:28:59,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:59,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 04:29:01,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:29:04,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:29:04,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:05,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:06,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:08,168 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 04:29:10,859 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.91 vs. limit=15.0 2023-09-29 04:29:11,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:15,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:20,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:29:20,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. 
Duration: 0.99 2023-09-29 04:29:23,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=249026.66666666666, ans=0.0 2023-09-29 04:29:23,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=249026.66666666666, ans=0.125 2023-09-29 04:29:24,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:29:24,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:24,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:26,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:29:28,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:29:30,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:29:31,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:31,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-09-29 04:29:36,128 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.113e+02 2.401e+02 2.733e+02 5.079e+02, threshold=4.803e+02, percent-clipped=1.0 2023-09-29 04:29:38,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:40,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:29:40,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:29:40,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:29:43,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:45,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 04:29:48,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:29:50,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:29:52,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:29:55,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 04:29:55,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 04:29:55,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:55,499 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 04:29:55,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=249160.0, ans=0.125 2023-09-29 04:29:58,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:00,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=249160.0, ans=0.125 2023-09-29 04:30:03,312 INFO [train.py:1039] (0/4) Epoch 8, batch 200, loss[loss=0.2314, simple_loss=0.2859, pruned_loss=0.08847, over 23786.00 frames. ], tot_loss[loss=0.2261, simple_loss=0.291, pruned_loss=0.08055, over 2993648.86 frames. ], batch size: 212, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:30:03,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:30:03,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:30:05,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 04:30:07,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 04:30:07,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:10,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 04:30:11,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:30:13,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:15,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:18,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:30:18,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:18,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:30,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=249293.33333333334, ans=0.1 2023-09-29 04:30:38,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=249360.0, ans=0.0 2023-09-29 04:30:43,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:30:43,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:30:44,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:30:46,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:30:46,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:30:46,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:30:47,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:49,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:30:49,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:49,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:30:51,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 04:30:53,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:30:53,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:58,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 04:30:58,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=249426.66666666666, ans=0.0 2023-09-29 04:31:03,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:31:05,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=249426.66666666666, ans=0.125 2023-09-29 04:31:05,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=249426.66666666666, ans=0.125 2023-09-29 04:31:09,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:09,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:31:13,932 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=249493.33333333334, ans=0.0 2023-09-29 04:31:16,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:19,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 04:31:21,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:21,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 04:31:21,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:23,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:31:24,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 04:31:25,901 INFO [train.py:1039] (0/4) Epoch 8, batch 250, loss[loss=0.1846, simple_loss=0.2525, pruned_loss=0.05837, over 24339.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.2899, pruned_loss=0.08072, over 3376366.28 frames. ], batch size: 56, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:31:25,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:31:26,021 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 04:31:28,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:29,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:31:32,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:32,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:34,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 04:31:36,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:36,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:31:39,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:31:50,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:31:55,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:55,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:32:03,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:32:03,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:32:04,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:32:05,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:05,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:32:05,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 04:32:05,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:10,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:32:13,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 04:32:13,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:32:16,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:32:16,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:32:16,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:32:17,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:17,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:32:17,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 04:32:20,842 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.201e+02 2.590e+02 2.939e+02 4.400e+02, threshold=5.181e+02, percent-clipped=0.0 2023-09-29 04:32:20,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:22,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:32:22,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:27,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:32:28,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=249760.0, ans=0.125 2023-09-29 04:32:33,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:36,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:32:41,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:42,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:32:46,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. 
Duration: 0.91 2023-09-29 04:32:47,831 INFO [train.py:1039] (0/4) Epoch 8, batch 300, loss[loss=0.1915, simple_loss=0.2647, pruned_loss=0.05917, over 19993.00 frames. ], tot_loss[loss=0.2227, simple_loss=0.2878, pruned_loss=0.07884, over 3672862.22 frames. ], batch size: 43, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:32:47,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:32:47,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:49,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 04:32:50,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:32:51,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:32:51,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 04:32:51,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=249893.33333333334, ans=0.0 2023-09-29 04:32:55,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:55,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 04:32:57,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=249893.33333333334, ans=0.125 2023-09-29 04:33:00,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:33:00,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 04:33:02,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:33:04,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:33:04,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 04:33:04,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:08,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:33:14,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:33:16,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 04:33:19,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 04:33:19,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 04:33:20,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:22,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:22,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 04:33:22,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:33:25,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:33:27,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:33:28,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:33:29,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=250026.66666666666, ans=0.0 2023-09-29 04:33:32,100 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=250026.66666666666, ans=0.0 2023-09-29 04:33:32,572 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.96 vs. limit=15.0 2023-09-29 04:33:33,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 04:33:33,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 04:33:33,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:33:36,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:38,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 04:33:40,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:42,340 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.41 vs. limit=15.0 2023-09-29 04:33:45,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:33:49,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:33:49,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 04:33:54,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:54,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 04:33:54,872 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=250160.0, ans=0.025 2023-09-29 04:33:56,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:57,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:33:57,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 04:33:57,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:33:59,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:01,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 04:34:01,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:34:02,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:04,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:04,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:05,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:34:10,634 INFO [train.py:1039] (0/4) Epoch 8, batch 350, loss[loss=0.1967, simple_loss=0.2705, pruned_loss=0.06147, over 24324.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2845, pruned_loss=0.0783, over 3896018.53 frames. ], batch size: 61, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:34:12,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:12,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:34:15,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:22,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:25,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:26,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:29,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 04:34:30,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:30,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 04:34:33,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:34:33,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 04:34:35,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:36,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 04:34:38,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:34:40,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:41,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:34:43,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:43,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:45,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:34:45,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:45,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:34:46,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 04:34:46,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:55,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:34:55,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:34:55,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:34:55,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:59,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 04:34:59,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:35:00,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=250426.66666666666, ans=0.1 2023-09-29 04:35:01,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=250426.66666666666, ans=0.125 2023-09-29 04:35:05,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:05,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 04:35:05,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:35:07,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 04:35:08,740 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.305e+02 2.882e+02 6.292e+02, threshold=4.610e+02, percent-clipped=1.0 2023-09-29 04:35:08,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:10,462 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 04:35:12,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 04:35:12,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:15,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:35:15,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 04:35:18,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:22,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:35:22,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:35:23,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=250493.33333333334, ans=0.125 2023-09-29 04:35:23,552 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.12 vs. limit=12.0 2023-09-29 04:35:24,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:24,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:28,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:31,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:35:34,509 INFO [train.py:1039] (0/4) Epoch 8, batch 400, loss[loss=0.2388, simple_loss=0.3046, pruned_loss=0.08646, over 23995.00 frames. ], tot_loss[loss=0.22, simple_loss=0.2848, pruned_loss=0.07763, over 4087756.95 frames. ], batch size: 86, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:35:34,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:35:34,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 04:35:36,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:36,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 04:35:36,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:35:37,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:39,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:39,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:41,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 04:35:43,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 04:35:43,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:44,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 04:35:46,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:47,992 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=250560.0, ans=0.1 2023-09-29 04:35:49,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:35:49,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 04:35:49,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 04:35:49,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:35:50,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:51,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:51,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:55,245 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 04:35:56,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 04:36:00,950 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=250626.66666666666, ans=0.125 2023-09-29 04:36:03,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:03,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:05,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 04:36:07,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-09-29 04:36:10,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:36:12,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:14,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=250693.33333333334, ans=0.0 2023-09-29 04:36:14,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=250693.33333333334, ans=0.2 2023-09-29 04:36:18,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=250693.33333333334, ans=0.125 2023-09-29 04:36:19,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 04:36:22,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:36:24,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 04:36:27,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:36:28,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:36:29,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 04:36:33,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 04:36:36,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:36:39,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:39,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=250826.66666666666, ans=0.125 2023-09-29 04:36:43,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:44,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 04:36:46,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:36:47,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 04:36:50,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:36:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:36:52,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 04:36:55,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:36:55,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 04:36:55,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:36:56,804 INFO [train.py:1039] (0/4) Epoch 8, batch 450, loss[loss=0.24, simple_loss=0.3125, pruned_loss=0.08373, over 24643.00 frames. ], tot_loss[loss=0.221, simple_loss=0.2862, pruned_loss=0.0779, over 4214641.15 frames. ], batch size: 73, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:36:57,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 04:36:57,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:36:58,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:59,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:36:59,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 04:36:59,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:37:01,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:37:03,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:37:16,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:37:16,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 04:37:18,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 04:37:23,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:37:26,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:28,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:31,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:31,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=251026.66666666666, ans=0.0 2023-09-29 04:37:32,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:35,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 04:37:35,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 04:37:38,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. 
Duration: 0.96 2023-09-29 04:37:38,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:37:38,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:40,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:37:43,855 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 04:37:43,869 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 04:37:45,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:47,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:37:49,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:37:52,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:37:52,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:37:54,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:37:54,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. 
Duration: 0.96 2023-09-29 04:37:56,017 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.144e+02 2.402e+02 2.848e+02 5.479e+02, threshold=4.804e+02, percent-clipped=2.0 2023-09-29 04:37:57,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:59,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:37:59,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:37:59,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 04:38:04,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:38:04,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 04:38:05,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 04:38:07,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:38:13,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:38:14,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:17,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 04:38:18,581 INFO [train.py:1039] (0/4) Epoch 8, batch 500, loss[loss=0.2278, simple_loss=0.2863, pruned_loss=0.08465, over 23509.00 frames. ], tot_loss[loss=0.2198, simple_loss=0.2855, pruned_loss=0.07704, over 4334666.59 frames. ], batch size: 285, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:38:18,687 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 04:38:22,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:23,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:38:24,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:24,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=251226.66666666666, ans=0.125 2023-09-29 04:38:25,489 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 04:38:25,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 04:38:25,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:29,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 04:38:29,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=251226.66666666666, ans=0.125 2023-09-29 04:38:32,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:38:35,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:38:35,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=251293.33333333334, ans=0.125 2023-09-29 04:38:37,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:38,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:39,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:38:49,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:49,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:38:49,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=251360.0, ans=0.035 2023-09-29 04:38:50,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:38:50,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:38:50,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 04:38:50,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:38:54,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:38:56,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:38:57,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:38:57,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:58,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 04:39:00,694 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 04:39:03,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:05,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:39:07,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:39:08,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=251426.66666666666, ans=0.125 2023-09-29 04:39:09,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 04:39:13,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:39:14,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:17,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:19,689 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:39:25,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:28,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 04:39:28,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:28,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 04:39:29,719 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.46 vs. limit=22.5 2023-09-29 04:39:32,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 04:39:33,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:39:35,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:41,376 INFO [train.py:1039] (0/4) Epoch 8, batch 550, loss[loss=0.2352, simple_loss=0.3066, pruned_loss=0.08192, over 24538.00 frames. ], tot_loss[loss=0.2216, simple_loss=0.2872, pruned_loss=0.07798, over 4402165.48 frames. ], batch size: 71, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:39:41,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 04:39:43,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 04:39:43,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:43,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 04:39:44,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:39:44,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:39:46,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:39:49,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:39:50,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:52,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 04:39:52,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:39:52,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=251560.0, ans=0.125 2023-09-29 04:39:54,450 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.95 vs. limit=6.0 2023-09-29 04:39:58,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:39:58,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:40:00,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:02,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:08,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 04:40:10,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 04:40:11,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:40:16,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:40:16,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:16,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:40:20,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:20,945 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 04:40:22,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:40:22,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=251693.33333333334, ans=0.04949747468305833 2023-09-29 04:40:23,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:40:25,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:25,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:40:27,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:40:27,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:28,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 04:40:30,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 04:40:30,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:32,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:32,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:40:32,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:40:36,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=251760.0, ans=0.1 2023-09-29 04:40:37,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:40:37,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:40:38,973 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.030e+02 2.358e+02 2.809e+02 4.445e+02, threshold=4.716e+02, percent-clipped=0.0 2023-09-29 04:40:40,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:40:41,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:42,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:40:43,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:40:44,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:46,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:40:46,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:40:49,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:40:49,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:40:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 04:40:57,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=251826.66666666666, ans=0.125 2023-09-29 04:40:58,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 04:40:59,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:40:59,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:41:01,243 INFO [train.py:1039] (0/4) Epoch 8, batch 600, loss[loss=0.2256, simple_loss=0.2837, pruned_loss=0.08377, over 16617.00 frames. ], tot_loss[loss=0.2224, simple_loss=0.2877, pruned_loss=0.0786, over 4449838.50 frames. ], batch size: 35, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:41:01,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:08,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:41:12,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 04:41:14,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 04:41:15,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:41:18,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:21,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:24,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 04:41:24,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:41:30,195 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.22 vs. limit=15.0 2023-09-29 04:41:30,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 04:41:33,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:41:33,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:33,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:41:40,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:41:40,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:41:43,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:45,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=252026.66666666666, ans=0.1 2023-09-29 04:41:51,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:41:56,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:56,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:56,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:42:00,238 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.87 vs. limit=22.5 2023-09-29 04:42:02,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 04:42:07,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:42:08,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 04:42:12,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 04:42:12,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:42:15,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 04:42:15,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:42:16,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:42:17,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=252160.0, ans=0.125 2023-09-29 04:42:22,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:42:25,708 INFO [train.py:1039] (0/4) Epoch 8, batch 650, loss[loss=0.2444, simple_loss=0.2954, pruned_loss=0.09666, over 23821.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.2869, pruned_loss=0.07863, over 4506231.27 frames. ], batch size: 180, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:42:25,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:42:28,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:42:29,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 04:42:31,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:34,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 04:42:35,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:42:40,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:42:40,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:43,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:46,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 04:42:48,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:42:48,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:52,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:54,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:42:55,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:42:57,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:59,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:42:59,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:02,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:43:04,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=252360.0, ans=0.2 2023-09-29 04:43:05,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:43:05,717 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 04:43:05,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:05,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:10,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:11,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:11,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 04:43:11,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:43:13,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 04:43:14,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:43:14,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:43:16,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:43:16,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:17,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:43:19,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 04:43:19,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 04:43:19,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:19,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:21,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 04:43:21,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:43:23,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:43:26,229 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.104e+02 2.347e+02 2.945e+02 4.272e+02, threshold=4.693e+02, percent-clipped=0.0 2023-09-29 04:43:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:32,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:43:34,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:36,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:37,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:43:38,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:46,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:43:46,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:43:47,894 INFO [train.py:1039] (0/4) Epoch 8, batch 700, loss[loss=0.2119, simple_loss=0.2937, pruned_loss=0.06505, over 24321.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2847, pruned_loss=0.07795, over 4540649.28 frames. ], batch size: 74, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:43:47,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:43:48,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:51,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=252560.0, ans=0.125 2023-09-29 04:43:52,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 04:43:52,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 04:43:54,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 04:43:56,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:58,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=252560.0, ans=0.125 2023-09-29 04:43:59,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:44:01,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. 
Duration: 0.99 2023-09-29 04:44:06,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:09,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:44:11,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:13,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:44:13,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:44:16,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:19,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 04:44:19,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:44:21,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 04:44:22,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 04:44:25,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:44:26,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 04:44:26,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=252693.33333333334, ans=0.2 2023-09-29 04:44:27,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:44:32,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:44:34,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 04:44:36,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=252760.0, ans=0.125 2023-09-29 04:44:38,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:38,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:44:38,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 04:44:43,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:45,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:48,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:44:53,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 04:44:53,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 04:44:56,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 04:44:56,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 04:44:59,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:44:59,992 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=252826.66666666666, ans=0.125 2023-09-29 04:45:03,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:04,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:05,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:06,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 04:45:11,584 INFO [train.py:1039] (0/4) Epoch 8, batch 750, loss[loss=0.2177, simple_loss=0.302, pruned_loss=0.0667, over 24278.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.2851, pruned_loss=0.07783, over 4585967.74 frames. ], batch size: 74, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:45:11,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. 
Duration: 0.55 2023-09-29 04:45:11,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 04:45:11,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 04:45:13,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 04:45:13,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 04:45:14,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:45:16,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 04:45:17,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:17,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:19,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:21,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:21,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:45:21,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:45:24,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:45:26,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:45:27,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=252960.0, ans=0.125 2023-09-29 04:45:28,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:45:31,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:33,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:34,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 04:45:35,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:45:36,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:37,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:39,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:45:40,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. 
Duration: 0.65 2023-09-29 04:45:40,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:42,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 04:45:44,036 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 04:45:45,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 04:45:45,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:45:45,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:45:47,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:45:55,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:55,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:45:55,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:45:58,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:58,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:46:00,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 04:46:00,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:46:01,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:46:03,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:46:08,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:46:08,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 04:46:08,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:11,300 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.056e+02 2.287e+02 2.694e+02 4.439e+02, threshold=4.575e+02, percent-clipped=0.0 2023-09-29 04:46:13,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:13,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:46:15,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:18,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 04:46:22,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 04:46:22,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=253160.0, ans=0.125 2023-09-29 04:46:24,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:24,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:31,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:32,923 INFO [train.py:1039] (0/4) Epoch 8, batch 800, loss[loss=0.1811, simple_loss=0.2469, pruned_loss=0.05761, over 24307.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2861, pruned_loss=0.07755, over 4621336.98 frames. ], batch size: 56, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:46:32,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:46:42,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:42,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 04:46:43,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:44,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:44,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=253226.66666666666, ans=0.125 2023-09-29 04:46:46,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:46,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:47,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:53,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:54,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:46:56,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 04:46:56,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:59,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:59,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 04:46:59,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:46:59,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 04:46:59,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:01,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 04:47:04,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:07,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:08,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:47:08,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:47:13,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:13,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:16,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:47:18,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 04:47:18,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:47:20,163 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 04:47:20,706 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-09-29 04:47:21,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 04:47:21,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:47:21,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:47:23,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:23,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:47:28,448 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 04:47:29,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 04:47:31,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:47:33,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 04:47:33,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=253426.66666666666, ans=0.0 2023-09-29 04:47:36,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:47:40,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:41,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 04:47:42,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:47:44,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 04:47:52,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:47:56,239 INFO [train.py:1039] (0/4) Epoch 8, batch 850, loss[loss=0.2267, simple_loss=0.303, pruned_loss=0.07521, over 24651.00 frames. ], tot_loss[loss=0.223, simple_loss=0.2886, pruned_loss=0.07866, over 4628860.64 frames. ], batch size: 73, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:47:56,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:47:56,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 04:47:57,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 04:47:59,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:59,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 04:48:00,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:02,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:48:03,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:05,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:48:06,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:48:08,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 04:48:08,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 04:48:08,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 04:48:08,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=253560.0, ans=0.125 2023-09-29 04:48:10,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 04:48:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:48:12,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:14,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:48:14,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:48:20,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:20,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:20,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 04:48:24,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 04:48:26,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:28,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 04:48:32,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 04:48:34,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. 
Duration: 0.8 2023-09-29 04:48:37,096 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 04:48:37,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:37,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:48:37,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:48:40,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 04:48:43,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:45,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:46,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=253760.0, ans=0.2 2023-09-29 04:48:47,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:48:47,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 04:48:50,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:48:51,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=253760.0, ans=0.0 2023-09-29 04:48:52,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:48:52,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 04:48:54,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=253760.0, ans=0.125 2023-09-29 04:48:56,831 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.108e+02 2.249e+02 2.560e+02 3.769e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 04:48:57,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:48:57,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:48:57,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=253760.0, ans=0.125 2023-09-29 04:48:58,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:48:58,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:00,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:49:01,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:49:05,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:49:06,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:49:08,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:08,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:49:15,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=253826.66666666666, ans=0.1 2023-09-29 04:49:17,884 INFO [train.py:1039] (0/4) Epoch 8, batch 900, loss[loss=0.2183, simple_loss=0.2833, pruned_loss=0.07664, over 24431.00 frames. ], tot_loss[loss=0.2229, simple_loss=0.2887, pruned_loss=0.07861, over 4661884.63 frames. ], batch size: 58, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:49:18,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:49:20,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:49:20,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 04:49:20,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 04:49:20,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:21,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 04:49:28,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:49:32,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=253893.33333333334, ans=0.2 2023-09-29 04:49:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:33,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 04:49:35,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=253960.0, ans=0.125 2023-09-29 04:49:36,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:49:36,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 04:49:36,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=253960.0, ans=0.125 2023-09-29 04:49:38,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:49:40,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 04:49:40,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:40,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:49:40,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:49:40,790 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.81 vs. limit=22.5 2023-09-29 04:49:50,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:50,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:50,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:49:52,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:58,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 04:50:00,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:50:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. 
Duration: 0.536375 2023-09-29 04:50:04,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:50:05,760 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 04:50:05,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 04:50:15,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:50:15,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:50:15,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:50:18,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=254093.33333333334, ans=0.0 2023-09-29 04:50:21,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:21,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:50:25,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 04:50:25,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 04:50:25,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=254160.0, ans=0.07 2023-09-29 04:50:28,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 04:50:29,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:50:29,949 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=254160.0, ans=0.125 2023-09-29 04:50:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:31,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:50:31,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:50:36,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 04:50:36,833 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 04:50:38,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:50:38,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 04:50:41,192 INFO [train.py:1039] (0/4) Epoch 8, batch 950, loss[loss=0.266, simple_loss=0.3084, pruned_loss=0.1118, over 19952.00 frames. 
], tot_loss[loss=0.2237, simple_loss=0.2893, pruned_loss=0.07907, over 4657861.68 frames. ], batch size: 388, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:50:41,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:46,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 04:50:51,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:50:52,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=254226.66666666666, ans=0.125 2023-09-29 04:50:53,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:53,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:55,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:50:56,833 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 04:50:57,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=254293.33333333334, ans=0.05 2023-09-29 04:51:01,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:02,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 04:51:03,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:04,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:51:04,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 04:51:04,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:51:05,174 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=254293.33333333334, ans=0.2 2023-09-29 04:51:07,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:08,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 04:51:08,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:12,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:12,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:13,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:51:13,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. 
Duration: 0.88 2023-09-29 04:51:15,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:51:18,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:20,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:51:24,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:51:24,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:27,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 04:51:29,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 04:51:29,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:51:29,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:31,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:31,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:51:37,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. 
Duration: 0.76 2023-09-29 04:51:39,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:51:41,993 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.975e+02 2.273e+02 2.583e+02 4.078e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 04:51:42,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:42,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:42,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 04:51:42,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:42,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:51:42,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 04:51:48,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:51:52,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 04:51:56,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=254493.33333333334, ans=0.125 2023-09-29 04:51:58,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=254493.33333333334, ans=0.125 2023-09-29 04:51:59,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:51:59,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 04:52:01,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 04:52:03,997 INFO [train.py:1039] (0/4) Epoch 8, batch 1000, loss[loss=0.2066, simple_loss=0.2768, pruned_loss=0.06824, over 24315.00 frames. ], tot_loss[loss=0.2224, simple_loss=0.2881, pruned_loss=0.07841, over 4671810.30 frames. ], batch size: 61, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:52:04,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:52:07,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 04:52:09,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:15,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:52:16,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. 
Duration: 0.86 2023-09-29 04:52:16,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 04:52:20,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=254626.66666666666, ans=0.125 2023-09-29 04:52:22,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:22,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:52:22,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=254626.66666666666, ans=0.125 2023-09-29 04:52:23,136 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.94 vs. limit=22.5 2023-09-29 04:52:23,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:24,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=254626.66666666666, ans=0.125 2023-09-29 04:52:27,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 04:52:31,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 04:52:32,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 04:52:32,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:52:34,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 04:52:37,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 04:52:37,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 04:52:39,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:40,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:48,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:50,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:52:50,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:51,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:51,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 04:52:51,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:54,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 04:52:54,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 04:52:58,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 04:52:58,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 04:53:00,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 04:53:02,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:53:09,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:09,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:53:10,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:10,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:53:13,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 04:53:15,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 04:53:15,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 04:53:16,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=254826.66666666666, ans=0.125 2023-09-29 04:53:17,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 04:53:17,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=254826.66666666666, ans=15.0 2023-09-29 04:53:19,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:19,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:53:22,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:53:22,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:53:23,145 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.48 vs. limit=15.0 2023-09-29 04:53:25,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:53:27,387 INFO [train.py:1039] (0/4) Epoch 8, batch 1050, loss[loss=0.234, simple_loss=0.3015, pruned_loss=0.08322, over 23363.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2853, pruned_loss=0.0779, over 4670973.61 frames. 
], batch size: 93, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:53:30,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:53:30,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:53:32,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:53:33,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:33,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=254893.33333333334, ans=0.125 2023-09-29 04:53:38,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:53:39,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:53:41,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:53:42,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:53:44,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:53:44,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 04:53:44,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:53:46,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 04:53:46,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=254960.0, ans=0.125 2023-09-29 04:53:47,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:53:47,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 04:53:50,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:50,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 04:53:50,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:53:57,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:59,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:54:00,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:54:03,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. 
Duration: 0.81 2023-09-29 04:54:03,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 04:54:03,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:54:07,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 04:54:10,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 04:54:12,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:14,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=255026.66666666666, ans=0.0 2023-09-29 04:54:15,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:54:19,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 04:54:19,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:54:19,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:54:22,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:54:24,302 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.63 vs. 
limit=15.0 2023-09-29 04:54:27,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 04:54:28,812 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.059e+02 2.267e+02 2.771e+02 5.438e+02, threshold=4.534e+02, percent-clipped=2.0 2023-09-29 04:54:28,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 04:54:29,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 04:54:30,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:30,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:54:32,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 04:54:37,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:54:38,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:38,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:54:38,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:38,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:54:42,335 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:54:42,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=255160.0, ans=0.125 2023-09-29 04:54:44,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:44,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 04:54:46,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:46,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 04:54:47,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 04:54:47,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:54:50,436 INFO [train.py:1039] (0/4) Epoch 8, batch 1100, loss[loss=0.2339, simple_loss=0.3052, pruned_loss=0.08134, over 23602.00 frames. ], tot_loss[loss=0.2199, simple_loss=0.2856, pruned_loss=0.07708, over 4691122.41 frames. ], batch size: 85, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:54:50,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=255226.66666666666, ans=0.2 2023-09-29 04:54:52,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 04:54:56,251 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.76 vs. limit=12.0 2023-09-29 04:54:56,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:55:00,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:55:01,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=255226.66666666666, ans=0.2 2023-09-29 04:55:03,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:55:03,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:03,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 04:55:05,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:55:05,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=255293.33333333334, ans=0.125 2023-09-29 04:55:08,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:55:10,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:55:13,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 04:55:15,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 04:55:15,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:55:17,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:17,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:55:20,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:55:22,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=255360.0, ans=0.125 2023-09-29 04:55:23,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:55:25,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=255360.0, ans=0.125 2023-09-29 04:55:29,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:55:32,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 04:55:34,456 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 04:55:35,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 04:55:37,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:39,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:55:40,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:55:42,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 04:55:43,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:55:43,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:55:43,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:55:44,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:45,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 04:55:45,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=255426.66666666666, ans=0.0 2023-09-29 04:55:50,233 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.42 vs. 
limit=15.0 2023-09-29 04:55:50,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:55:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 04:55:54,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:55:59,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:56:00,441 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.98 vs. limit=22.5 2023-09-29 04:56:01,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 04:56:01,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:56:02,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:05,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:05,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:07,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 04:56:08,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 04:56:08,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:10,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 04:56:10,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:56:11,506 INFO [train.py:1039] (0/4) Epoch 8, batch 1150, loss[loss=0.2096, simple_loss=0.2739, pruned_loss=0.07263, over 24452.00 frames. ], tot_loss[loss=0.2191, simple_loss=0.2853, pruned_loss=0.0765, over 4703589.65 frames. ], batch size: 58, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:56:11,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 04:56:12,339 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.18 vs. limit=15.0 2023-09-29 04:56:13,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:56:13,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:56:15,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:56:19,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:21,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 04:56:25,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:25,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:56:25,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 04:56:26,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:27,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=255626.66666666666, ans=0.1 2023-09-29 04:56:29,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 04:56:32,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:32,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:56:36,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 04:56:38,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:40,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=255626.66666666666, ans=0.125 2023-09-29 04:56:41,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 04:56:43,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:56:43,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 04:56:43,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:56:44,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:48,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 04:56:49,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:50,014 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:56:51,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:52,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=255693.33333333334, ans=0.125 2023-09-29 04:57:00,498 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.57 vs. limit=15.0 2023-09-29 04:57:01,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=255760.0, ans=0.1 2023-09-29 04:57:03,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 04:57:09,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:09,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 04:57:11,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:11,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:12,648 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 2.096e+02 2.373e+02 2.802e+02 4.520e+02, threshold=4.746e+02, percent-clipped=0.0 2023-09-29 04:57:17,537 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 04:57:19,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:27,170 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 04:57:30,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:32,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:57:32,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:57:33,614 INFO [train.py:1039] (0/4) Epoch 8, batch 1200, loss[loss=0.2083, simple_loss=0.2792, pruned_loss=0.06868, over 24485.00 frames. 
], tot_loss[loss=0.2193, simple_loss=0.2855, pruned_loss=0.07652, over 4696708.20 frames. ], batch size: 63, lr: 1.32e-02, grad_scale: 32.0 2023-09-29 04:57:33,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:57:37,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:42,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:57:43,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:57:45,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:57:45,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:45,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:57:46,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:57:48,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:57:49,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:49,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 04:57:51,414 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 04:57:56,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 04:58:00,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:58:00,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=255960.0, ans=0.125 2023-09-29 04:58:03,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:58:04,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:06,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:06,959 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 04:58:08,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:18,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:58:18,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:58:18,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. 
Duration: 0.94 2023-09-29 04:58:19,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.97 vs. limit=15.0 2023-09-29 04:58:20,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:58:23,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 04:58:26,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 04:58:26,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:28,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:58:29,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:29,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:58:31,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:31,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:58:33,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:58:33,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. 
Duration: 0.88 2023-09-29 04:58:33,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:58:35,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:35,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:58:38,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:58:38,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:43,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:58:46,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:58:48,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 04:58:50,442 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 04:58:53,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:56,297 INFO [train.py:1039] (0/4) Epoch 8, batch 1250, loss[loss=0.2335, simple_loss=0.3005, pruned_loss=0.0832, over 23353.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2863, pruned_loss=0.07694, over 4700697.39 frames. 
], batch size: 93, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 04:58:56,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:56,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:58:59,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:59:01,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 04:59:06,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:59:08,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:08,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 04:59:11,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:59:11,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:59:15,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:59:17,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:18,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 04:59:18,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:20,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:59:25,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:59:25,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:59:25,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:59:25,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:27,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:27,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=256293.33333333334, ans=0.125 2023-09-29 04:59:30,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:31,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:59:36,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-09-29 04:59:36,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:59:38,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=256360.0, ans=0.125 2023-09-29 04:59:40,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:59:41,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 04:59:43,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:43,055 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 04:59:43,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:43,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:46,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:50,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:51,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:59:52,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-09-29 04:59:52,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 04:59:52,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 04:59:56,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:59:58,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 04:59:58,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:01,788 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.911e+02 2.146e+02 2.412e+02 3.765e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 05:00:03,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:00:03,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:00:03,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=256493.33333333334, ans=0.125 2023-09-29 05:00:05,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 05:00:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:00:05,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 05:00:05,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:00:06,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:08,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 05:00:08,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=256493.33333333334, ans=0.2 2023-09-29 05:00:10,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:12,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:00:13,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:00:16,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:00:18,719 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.75 vs. limit=15.0 2023-09-29 05:00:19,988 INFO [train.py:1039] (0/4) Epoch 8, batch 1300, loss[loss=0.2171, simple_loss=0.2677, pruned_loss=0.08325, over 23833.00 frames. ], tot_loss[loss=0.2214, simple_loss=0.2873, pruned_loss=0.07778, over 4695755.67 frames. ], batch size: 164, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:00:21,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 05:00:22,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 05:00:22,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=256560.0, ans=0.0 2023-09-29 05:00:25,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:26,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:00:26,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:00:29,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:31,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:00:31,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=256560.0, ans=0.2 2023-09-29 05:00:33,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 05:00:39,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:00:39,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 05:00:39,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=256626.66666666666, ans=0.125 2023-09-29 05:00:42,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 05:00:42,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=256626.66666666666, ans=0.0 2023-09-29 05:00:44,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:00:49,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:50,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:51,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:53,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:54,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:00:54,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:00:56,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 05:01:02,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 05:01:02,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:01:04,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 05:01:06,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:01:09,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:01:09,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:01:10,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 05:01:11,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:12,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 05:01:13,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:17,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:01:17,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:01:22,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. 
Duration: 0.87 2023-09-29 05:01:22,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 05:01:23,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 05:01:27,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:01:30,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 05:01:34,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:37,689 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=256826.66666666666, ans=0.125 2023-09-29 05:01:42,297 INFO [train.py:1039] (0/4) Epoch 8, batch 1350, loss[loss=0.1973, simple_loss=0.2765, pruned_loss=0.05903, over 24342.00 frames. ], tot_loss[loss=0.2207, simple_loss=0.2862, pruned_loss=0.07765, over 4680523.13 frames. ], batch size: 61, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:01:42,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 05:01:46,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:48,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:01:51,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:01:51,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:54,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:01:54,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:01:58,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:02:00,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 05:02:02,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:03,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:02:04,321 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:02:05,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 05:02:05,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:02:06,468 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=10.39 vs. limit=10.0 2023-09-29 05:02:07,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:02:07,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 05:02:10,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 05:02:12,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 05:02:14,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:14,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 05:02:27,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:37,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 05:02:40,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:44,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 05:02:44,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 05:02:45,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:02:46,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=257093.33333333334, ans=0.0 2023-09-29 05:02:47,102 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.144e+02 2.487e+02 2.898e+02 4.537e+02, threshold=4.974e+02, percent-clipped=1.0 2023-09-29 05:02:48,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:02:51,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 05:02:53,167 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.74 vs. limit=15.0 2023-09-29 05:02:53,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:02:58,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 05:02:59,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=257160.0, ans=0.0 2023-09-29 05:03:00,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 05:03:05,411 INFO [train.py:1039] (0/4) Epoch 8, batch 1400, loss[loss=0.2256, simple_loss=0.2824, pruned_loss=0.08441, over 23678.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2847, pruned_loss=0.07656, over 4694535.48 frames. 
], batch size: 232, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:03:08,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 05:03:10,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:03:12,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:03:13,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:03:18,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 05:03:22,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 05:03:26,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=257293.33333333334, ans=0.125 2023-09-29 05:03:31,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:03:32,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:34,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:03:34,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 05:03:38,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:03:38,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 05:03:48,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:50,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:54,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 05:03:54,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:03:56,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:03:56,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:03:57,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:59,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:03:59,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:04:01,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 05:04:01,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 05:04:02,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:04:03,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=257426.66666666666, ans=0.125 2023-09-29 05:04:07,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:10,231 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=257426.66666666666, ans=0.2 2023-09-29 05:04:11,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:04:19,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 05:04:21,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:04:21,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:04:24,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 05:04:24,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:28,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 05:04:29,511 INFO [train.py:1039] (0/4) Epoch 8, batch 1450, loss[loss=0.202, simple_loss=0.2588, pruned_loss=0.07261, over 23376.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2839, pruned_loss=0.07588, over 4709774.13 frames. ], batch size: 285, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:04:31,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:04:35,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:04:35,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:35,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:04:39,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:41,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:04:42,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:04:42,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 05:04:44,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:04:46,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. 
Duration: 0.9 2023-09-29 05:04:46,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:48,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:48,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 05:04:50,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:04:50,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:04:50,643 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.16 vs. limit=15.0 2023-09-29 05:04:51,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 05:04:52,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:53,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=257626.66666666666, ans=0.125 2023-09-29 05:04:53,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=257626.66666666666, ans=0.0 2023-09-29 05:04:54,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 05:04:55,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:59,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:01,477 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.39 vs. limit=10.0 2023-09-29 05:05:02,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:05:02,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:05:05,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:05:05,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:10,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:05:10,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:05:12,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=257693.33333333334, ans=0.1 2023-09-29 05:05:13,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 05:05:15,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:05:17,443 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=257760.0, ans=0.1 2023-09-29 05:05:20,607 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 05:05:22,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:22,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:05:22,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=257760.0, ans=0.1 2023-09-29 05:05:25,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:26,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=257760.0, ans=0.1 2023-09-29 05:05:27,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 05:05:30,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:05:32,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 05:05:33,284 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.099e+02 2.346e+02 2.740e+02 3.754e+02, threshold=4.692e+02, percent-clipped=0.0 2023-09-29 05:05:33,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 05:05:35,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:36,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:05:37,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=257826.66666666666, ans=0.125 2023-09-29 05:05:38,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:40,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 05:05:44,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 05:05:44,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 05:05:46,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:48,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 05:05:51,398 INFO [train.py:1039] (0/4) Epoch 8, batch 1500, loss[loss=0.2297, simple_loss=0.305, pruned_loss=0.07716, over 24319.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.2845, pruned_loss=0.07651, over 4711213.05 frames. ], batch size: 77, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:05:58,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 05:06:00,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:06:00,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:06:01,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:02,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:03,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:06:05,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 05:06:05,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:06:06,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:06:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 05:06:08,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:06:09,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:06:11,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 05:06:16,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:06:16,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:06:16,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:22,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 05:06:27,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 05:06:28,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:28,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-09-29 05:06:31,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:06:33,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:06:33,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:33,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:06:36,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 05:06:36,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:06:38,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:38,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 05:06:39,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:44,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:06:44,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 05:06:48,189 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.13 vs. 
limit=15.0 2023-09-29 05:06:52,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:06:53,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:06:59,008 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 05:07:00,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:00,519 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 05:07:02,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:02,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:02,266 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 05:07:05,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:07:07,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 05:07:11,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:12,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:07:12,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:14,175 INFO [train.py:1039] (0/4) Epoch 8, batch 1550, loss[loss=0.2133, simple_loss=0.2818, pruned_loss=0.07242, over 23438.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.2852, pruned_loss=0.07667, over 4718985.52 frames. ], batch size: 106, lr: 1.31e-02, grad_scale: 4.0 2023-09-29 05:07:14,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:14,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:15,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:07:17,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 05:07:18,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 05:07:18,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:07:19,196 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=258226.66666666666, ans=0.0 2023-09-29 05:07:20,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 05:07:20,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-09-29 05:07:22,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:23,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:23,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:25,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:07:27,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:27,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:29,093 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 05:07:29,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:31,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:07:31,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:07:32,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:07:32,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. 
Duration: 0.71 2023-09-29 05:07:34,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:36,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 05:07:36,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 05:07:36,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 05:07:38,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:40,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:43,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:43,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 05:07:43,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 05:07:53,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:57,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:57,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 05:07:58,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:07:58,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 05:08:04,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:08:06,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:10,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:08:13,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:08:15,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:08:15,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 05:08:15,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:18,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:08:18,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:19,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. 
Duration: 0.57775 2023-09-29 05:08:19,656 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 05:08:20,934 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.986e+02 2.177e+02 2.778e+02 5.075e+02, threshold=4.355e+02, percent-clipped=1.0 2023-09-29 05:08:21,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 05:08:31,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:31,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:33,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 05:08:36,171 INFO [train.py:1039] (0/4) Epoch 8, batch 1600, loss[loss=0.2239, simple_loss=0.2861, pruned_loss=0.0808, over 23402.00 frames. ], tot_loss[loss=0.2216, simple_loss=0.2869, pruned_loss=0.07813, over 4707112.89 frames. ], batch size: 119, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:08:37,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:38,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:38,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 05:08:38,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:08:39,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:08:43,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:43,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 05:08:45,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 05:08:46,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 05:08:48,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=258560.0, ans=0.125 2023-09-29 05:08:50,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:08:52,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 05:08:53,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:08:55,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:08:59,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 05:09:02,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 05:09:06,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:09:07,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 05:09:07,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:09,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 05:09:15,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 05:09:20,964 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.26 vs. limit=10.0 2023-09-29 05:09:24,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:24,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 05:09:25,597 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=12.0 2023-09-29 05:09:26,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:26,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:09:26,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:09:29,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:09:33,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:09:34,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:09:36,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:09:37,337 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.70 vs. limit=10.0 2023-09-29 05:09:39,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:09:40,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:09:41,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 05:09:49,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:49,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:09:52,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 05:09:52,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:09:53,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 05:09:58,308 INFO [train.py:1039] (0/4) Epoch 8, batch 1650, loss[loss=0.1917, simple_loss=0.2661, pruned_loss=0.05859, over 24580.00 frames. ], tot_loss[loss=0.2213, simple_loss=0.2869, pruned_loss=0.07786, over 4702171.20 frames. ], batch size: 60, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:09:59,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:01,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:01,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:10:01,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 05:10:01,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-09-29 05:10:01,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 05:10:03,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 05:10:07,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:10:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:07,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:07,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:10:10,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:12,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=258960.0, ans=0.0 2023-09-29 05:10:13,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 05:10:16,302 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.39 vs. limit=12.0 2023-09-29 05:10:16,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:10:16,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 05:10:16,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:10:16,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:10:16,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 05:10:16,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 05:10:21,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=258960.0, ans=0.0 2023-09-29 05:10:22,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:10:24,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=258960.0, ans=0.0 2023-09-29 05:10:25,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:10:32,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=259026.66666666666, ans=0.1 2023-09-29 05:10:35,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 05:10:35,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:37,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-09-29 05:10:39,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=259026.66666666666, ans=0.0 2023-09-29 05:10:41,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:10:42,721 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.34 vs. limit=15.0 2023-09-29 05:10:43,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:10:43,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:10:43,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=259026.66666666666, ans=0.125 2023-09-29 05:10:44,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:10:46,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:47,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:50,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:10:50,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:51,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:10:53,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:54,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:10:57,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=259093.33333333334, ans=0.125 2023-09-29 05:10:58,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:59,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 05:11:00,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=259093.33333333334, ans=15.0 2023-09-29 05:11:01,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:11:02,717 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.991e+02 2.189e+02 2.754e+02 4.240e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 05:11:02,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. 
Duration: 0.76 2023-09-29 05:11:03,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 05:11:03,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 05:11:03,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:04,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:11:04,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:06,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:11:06,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 05:11:09,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:11,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:11:11,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:14,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 05:11:18,742 INFO [train.py:1039] (0/4) Epoch 8, batch 1700, loss[loss=0.2296, simple_loss=0.3038, pruned_loss=0.07772, over 24636.00 frames. 
], tot_loss[loss=0.2214, simple_loss=0.2862, pruned_loss=0.07836, over 4699062.67 frames. ], batch size: 68, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:11:18,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:18,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:11:18,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 05:11:19,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:19,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:11:19,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:25,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:11:25,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:11:25,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 05:11:28,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:11:37,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:11:40,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:11:45,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=259293.33333333334, ans=0.2 2023-09-29 05:11:47,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:11:47,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:11:48,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:48,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:11:51,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 05:11:53,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:11:53,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:54,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:11:56,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:11:57,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. 
Duration: 0.79 2023-09-29 05:11:58,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 05:12:00,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:01,697 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.36 vs. limit=15.0 2023-09-29 05:12:02,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 05:12:03,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:12:11,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=259426.66666666666, ans=0.0 2023-09-29 05:12:12,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:14,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:14,698 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.58 vs. limit=22.5 2023-09-29 05:12:15,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:12:15,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 05:12:15,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 05:12:17,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:12:18,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:18,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 05:12:18,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:12:18,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:20,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:20,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:22,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:22,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:12:24,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:25,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 05:12:25,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:29,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:29,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 05:12:33,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:35,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:38,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 05:12:38,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=259493.33333333334, ans=0.1 2023-09-29 05:12:38,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=259493.33333333334, ans=0.125 2023-09-29 05:12:41,308 INFO [train.py:1039] (0/4) Epoch 8, batch 1750, loss[loss=0.2072, simple_loss=0.2719, pruned_loss=0.07127, over 23624.00 frames. ], tot_loss[loss=0.2195, simple_loss=0.2835, pruned_loss=0.07773, over 4684216.09 frames. ], batch size: 149, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:12:43,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:45,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:12:45,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:12:45,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=259560.0, ans=0.0 2023-09-29 05:12:47,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 05:12:47,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:12:51,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:51,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=259560.0, ans=0.1 2023-09-29 05:12:54,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 05:12:57,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:00,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 05:13:00,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:02,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 05:13:05,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:13:05,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 05:13:08,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:13:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 05:13:19,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:13:22,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:13:22,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:27,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:27,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:29,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:13:30,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:33,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 05:13:33,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:35,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 05:13:36,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:40,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 05:13:40,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:41,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:42,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:13:47,666 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.975e+02 2.294e+02 2.712e+02 4.778e+02, threshold=4.588e+02, percent-clipped=2.0 2023-09-29 05:13:47,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:13:47,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 05:13:49,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:49,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 05:13:56,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:58,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:00,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:14:02,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 05:14:02,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:03,443 INFO [train.py:1039] (0/4) Epoch 8, batch 1800, loss[loss=0.2397, simple_loss=0.3102, pruned_loss=0.08464, over 23935.00 frames. ], tot_loss[loss=0.2191, simple_loss=0.2835, pruned_loss=0.07738, over 4693324.43 frames. ], batch size: 80, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:14:03,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:14:03,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:03,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:14:03,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:14:05,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 05:14:08,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:14:09,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:14:11,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:14:14,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:17,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:14:19,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:14:23,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:24,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=259960.0, ans=0.07 2023-09-29 05:14:25,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:26,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:28,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:14:30,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 05:14:30,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 05:14:31,168 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.10 vs. limit=10.0 2023-09-29 05:14:31,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:38,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 05:14:41,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 05:14:41,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 05:14:41,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:41,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:41,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:14:42,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 05:14:43,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=260026.66666666666, ans=0.2 2023-09-29 05:14:51,122 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 05:14:51,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:14:52,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:53,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=260093.33333333334, ans=0.125 2023-09-29 05:14:55,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 05:14:55,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 05:14:57,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:14:59,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:15:00,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:15:06,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 05:15:10,242 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.12 vs. 
limit=15.0 2023-09-29 05:15:12,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:15:12,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 05:15:14,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:15:14,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:14,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:15:14,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 05:15:17,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:15:17,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:20,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 05:15:20,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:22,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:22,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 05:15:22,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:15:24,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=260226.66666666666, ans=0.0 2023-09-29 05:15:25,471 INFO [train.py:1039] (0/4) Epoch 8, batch 1850, loss[loss=0.1949, simple_loss=0.2765, pruned_loss=0.0566, over 24403.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2849, pruned_loss=0.0777, over 4695806.24 frames. ], batch size: 77, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:15:27,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:15:27,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:30,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:15:32,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:15:41,250 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=260293.33333333334, ans=0.0 2023-09-29 05:15:42,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 05:15:42,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 05:15:45,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 05:15:48,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 05:15:49,781 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.94 vs. limit=15.0 2023-09-29 05:15:52,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:52,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 05:15:52,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 05:16:02,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:16:03,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 05:16:07,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:07,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-09-29 05:16:14,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:14,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:16:15,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:16:17,499 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=260426.66666666666, ans=0.0 2023-09-29 05:16:18,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:16:21,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:16:23,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=260426.66666666666, ans=0.0 2023-09-29 05:16:24,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:16:24,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:24,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:16:26,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:16:27,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 05:16:29,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:16:30,632 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.959e+02 2.142e+02 2.407e+02 4.178e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 05:16:32,653 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.650e-03 2023-09-29 05:16:33,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 05:16:35,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:38,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:16:39,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:16:39,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 05:16:39,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 05:16:42,829 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 05:16:45,000 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 05:16:45,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 05:16:46,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:46,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:16:46,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:47,976 INFO [train.py:1039] (0/4) Epoch 8, batch 1900, loss[loss=0.2182, simple_loss=0.276, pruned_loss=0.08018, over 23627.00 frames. ], tot_loss[loss=0.2196, simple_loss=0.285, pruned_loss=0.07711, over 4705572.57 frames. ], batch size: 149, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:16:48,086 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 05:16:48,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:16:48,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:49,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:16:51,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 05:16:51,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=260560.0, ans=0.1 2023-09-29 05:16:51,481 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=260560.0, ans=0.125 2023-09-29 05:16:52,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:52,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 05:16:54,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:54,275 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 05:16:54,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:16:54,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=260560.0, ans=0.0 2023-09-29 05:16:55,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:00,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:03,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:17:03,825 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-09-29 05:17:05,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 05:17:05,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:17:06,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:17:06,884 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 05:17:08,337 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 05:17:09,131 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.63 vs. limit=15.0 2023-09-29 05:17:11,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 05:17:13,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:17:19,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 05:17:22,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 05:17:23,481 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.41 vs. limit=15.0 2023-09-29 05:17:31,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. 
Duration: 0.78 2023-09-29 05:17:33,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 05:17:33,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:17:33,497 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 05:17:33,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 05:17:33,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 05:17:33,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 05:17:33,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:17:33,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=260693.33333333334, ans=0.125 2023-09-29 05:17:38,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 05:17:41,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:17:44,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:17:44,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. 
Duration: 0.92 2023-09-29 05:17:46,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:17:51,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 05:17:51,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:17:57,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:17:57,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:17:57,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:17:59,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:18:00,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:18:00,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:18:02,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:18:02,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=260826.66666666666, ans=0.125 2023-09-29 05:18:05,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:18:05,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:08,563 INFO [train.py:1039] (0/4) Epoch 8, batch 1950, loss[loss=0.2041, simple_loss=0.266, pruned_loss=0.07114, over 19785.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2854, pruned_loss=0.0765, over 4710520.27 frames. ], batch size: 43, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:18:08,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:18:08,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:08,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:18:11,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:13,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:16,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:18:16,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:16,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:18:19,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. 
Duration: 0.62 2023-09-29 05:18:19,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:18:21,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:23,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:27,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:18:27,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:27,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:28,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:18:30,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:30,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:18:30,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:18:31,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:18:38,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:38,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:38,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:18:38,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 05:18:38,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:18:39,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:18:39,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:43,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=261026.66666666666, ans=0.2 2023-09-29 05:18:44,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:47,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:18:50,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=261026.66666666666, ans=0.1 2023-09-29 05:18:51,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:18:55,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:18:55,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:18:55,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 05:18:56,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:18:57,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=261093.33333333334, ans=0.0 2023-09-29 05:19:00,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:19:01,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:19:02,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:04,217 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.27 vs. 
limit=22.5 2023-09-29 05:19:09,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=261093.33333333334, ans=0.1 2023-09-29 05:19:12,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:14,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:15,359 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.025e+02 2.337e+02 2.726e+02 4.544e+02, threshold=4.674e+02, percent-clipped=3.0 2023-09-29 05:19:17,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:19,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:21,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:19:23,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:25,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 05:19:25,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:19:26,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:19:26,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 05:19:30,269 INFO [train.py:1039] (0/4) Epoch 8, batch 2000, loss[loss=0.3063, simple_loss=0.3455, pruned_loss=0.1336, over 19371.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2866, pruned_loss=0.0772, over 4701845.11 frames. ], batch size: 388, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:19:30,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:32,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=261226.66666666666, ans=0.035 2023-09-29 05:19:35,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:35,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:19:37,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:39,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:19:40,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:43,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 05:19:43,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 05:19:46,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:19:49,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 05:19:49,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:19:49,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:54,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:19:56,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 05:19:57,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:57,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=261293.33333333334, ans=0.0 2023-09-29 05:19:59,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:01,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 05:20:01,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 05:20:03,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 05:20:03,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:06,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:08,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:20:08,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:08,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:10,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:12,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 05:20:15,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 05:20:15,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:15,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:20,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:20:21,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:20:21,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:22,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=261426.66666666666, ans=0.125 2023-09-29 05:20:23,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:20:26,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:26,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:26,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:27,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:29,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:30,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:33,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. 
Duration: 0.94 2023-09-29 05:20:39,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:20:39,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:20:48,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:48,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=261493.33333333334, ans=0.125 2023-09-29 05:20:49,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:49,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:51,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:20:51,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=261560.0, ans=0.0 2023-09-29 05:20:52,728 INFO [train.py:1039] (0/4) Epoch 8, batch 2050, loss[loss=0.2079, simple_loss=0.2545, pruned_loss=0.08064, over 23619.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2859, pruned_loss=0.07716, over 4702983.13 frames. 
], batch size: 256, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:20:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:20:53,386 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.95 vs. limit=15.0 2023-09-29 05:20:54,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:54,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:57,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:57,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:21:05,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:21:07,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:21:08,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:21:08,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 05:21:10,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=261626.66666666666, ans=0.0 2023-09-29 05:21:11,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 05:21:11,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:21:11,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:11,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:21:18,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=261626.66666666666, ans=0.125 2023-09-29 05:21:20,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=261626.66666666666, ans=0.125 2023-09-29 05:21:24,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:24,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:27,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. 
Duration: 0.72 2023-09-29 05:21:27,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=261693.33333333334, ans=0.1 2023-09-29 05:21:29,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:29,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 05:21:29,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=261693.33333333334, ans=10.0 2023-09-29 05:21:30,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:33,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:35,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:37,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:21:37,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:38,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:21:38,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 05:21:40,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:21:43,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:46,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:21:47,263 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:21:47,805 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.38 vs. limit=15.0 2023-09-29 05:21:48,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:21:50,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:54,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:21:57,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=261826.66666666666, ans=0.125 2023-09-29 05:21:59,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 05:22:01,287 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.119e+02 2.372e+02 3.018e+02 5.017e+02, threshold=4.745e+02, percent-clipped=2.0 2023-09-29 05:22:01,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 05:22:06,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:07,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:22:09,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:22:09,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 05:22:14,334 INFO [train.py:1039] (0/4) Epoch 8, batch 2100, loss[loss=0.2357, simple_loss=0.3071, pruned_loss=0.0821, over 24636.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.285, pruned_loss=0.0767, over 4711655.68 frames. ], batch size: 68, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:22:14,510 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 05:22:14,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:14,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:16,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 05:22:17,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=261893.33333333334, ans=0.125 2023-09-29 05:22:18,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:18,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 05:22:18,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 05:22:20,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:22:23,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:22:23,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:22:26,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:28,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:22:28,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 05:22:30,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:22:30,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-09-29 05:22:30,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 05:22:33,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:22:33,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:22:33,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 05:22:33,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 05:22:39,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 05:22:39,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:40,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:22:41,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:45,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:22:47,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 05:22:47,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:22:47,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 05:22:51,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 05:22:51,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=262026.66666666666, ans=0.0 2023-09-29 05:22:53,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:53,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 05:22:53,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 05:22:54,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 05:22:56,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:22:57,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:23:01,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:02,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:04,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:23:06,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:06,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 05:23:06,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:06,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:07,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:07,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 05:23:09,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 05:23:09,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=262093.33333333334, ans=0.125 2023-09-29 05:23:10,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 05:23:11,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=262093.33333333334, ans=0.1 2023-09-29 05:23:14,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:23:17,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 05:23:17,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 05:23:24,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:29,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:23:29,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:23:29,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:23:29,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:23:29,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=262160.0, ans=0.0 2023-09-29 05:23:31,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:23:32,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:32,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:23:32,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:23:34,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:23:34,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=262160.0, ans=0.125 2023-09-29 05:23:35,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 05:23:37,122 INFO [train.py:1039] (0/4) Epoch 8, batch 2150, loss[loss=0.2107, simple_loss=0.2728, pruned_loss=0.07424, over 23885.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.2842, pruned_loss=0.07644, over 4708849.70 frames. ], batch size: 195, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:23:37,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 05:23:37,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:40,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:40,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:23:40,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:23:41,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:23:45,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=262226.6666666667, ans=0.2 2023-09-29 05:23:47,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-29 05:23:50,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:51,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:54,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:23:54,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:23:54,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:23:59,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:59,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:23:59,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:24:02,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=262293.3333333333, ans=0.125 2023-09-29 05:24:03,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:04,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. 
Duration: 0.99 2023-09-29 05:24:08,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=262360.0, ans=0.125 2023-09-29 05:24:09,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:09,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:24:11,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:11,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:12,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:13,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:24:13,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:13,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:24:14,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:24:16,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 05:24:17,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 05:24:19,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:19,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:20,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:24:21,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:24:24,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:24,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:24:26,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:26,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 05:24:26,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:24:31,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:32,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:34,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:24:34,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:24:35,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:37,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:37,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 05:24:40,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 05:24:40,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:24:41,432 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 05:24:41,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:41,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:24:42,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 05:24:42,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:24:43,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-09-29 05:24:43,044 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 05:24:43,044 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 05:24:44,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 05:24:45,899 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.226e+02 2.643e+02 3.151e+02 6.561e+02, threshold=5.285e+02, percent-clipped=6.0 2023-09-29 05:24:46,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:46,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:46,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:24:47,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:47,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:24:47,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:47,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:57,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 05:24:58,881 INFO [train.py:1039] (0/4) Epoch 8, batch 2200, loss[loss=0.2301, simple_loss=0.3017, pruned_loss=0.07923, over 23944.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2848, pruned_loss=0.07621, over 4715928.63 frames. ], batch size: 80, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:24:59,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 05:25:04,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:25:04,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=262560.0, ans=0.5 2023-09-29 05:25:08,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:08,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=262560.0, ans=0.035 2023-09-29 05:25:10,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:25:10,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=262560.0, ans=0.0 2023-09-29 05:25:11,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:11,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:25:14,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:25:14,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=262626.6666666667, ans=0.125 2023-09-29 05:25:16,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:25:16,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 05:25:22,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 05:25:24,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:25:28,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 05:25:31,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=262693.3333333333, ans=0.0 2023-09-29 05:25:32,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:33,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:25:33,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:25:37,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:25:37,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. 
Duration: 0.94 2023-09-29 05:25:42,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:25:44,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:44,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 05:25:48,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:25:49,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:25:51,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:25:52,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:55,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 05:25:57,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:25:58,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. 
Duration: 0.81 2023-09-29 05:25:59,018 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=262760.0, ans=0.125 2023-09-29 05:26:00,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=262760.0, ans=0.1 2023-09-29 05:26:01,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:01,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:26:01,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:04,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:26:05,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:05,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:05,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:05,603 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:26:06,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:26:06,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 05:26:08,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:26:13,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:26:14,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:16,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:26:16,670 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 05:26:20,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:26:20,875 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 05:26:21,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=262893.3333333333, ans=0.125 2023-09-29 05:26:22,239 INFO [train.py:1039] (0/4) Epoch 8, batch 2250, loss[loss=0.2005, simple_loss=0.2778, pruned_loss=0.06161, over 24487.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2855, pruned_loss=0.07645, over 4718385.83 frames. ], batch size: 66, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:26:22,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:26:23,912 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-09-29 05:26:25,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:25,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:26:27,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:28,694 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 05:26:30,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:26:32,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:38,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:26:39,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:26:40,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:42,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:45,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. 
Duration: 0.93 2023-09-29 05:26:45,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:47,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:26:48,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 05:26:51,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:51,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:52,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:58,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:26:58,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:27:00,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 05:27:01,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 05:27:02,719 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.51 vs. limit=15.0 2023-09-29 05:27:03,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:27:06,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:08,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:10,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:10,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:27:11,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:27:14,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:27:18,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:27:20,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=263093.3333333333, ans=0.0 2023-09-29 05:27:21,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. 
Duration: 0.7 2023-09-29 05:27:27,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:27:27,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:27:28,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:27:31,595 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.954e+02 2.186e+02 2.448e+02 4.409e+02, threshold=4.373e+02, percent-clipped=0.0 2023-09-29 05:27:36,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:27:39,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:27:39,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 05:27:39,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:39,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:27:43,939 INFO [train.py:1039] (0/4) Epoch 8, batch 2300, loss[loss=0.2244, simple_loss=0.2948, pruned_loss=0.077, over 24642.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.2866, pruned_loss=0.07747, over 4702916.53 frames. ], batch size: 65, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:27:43,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. 
Duration: 0.78 2023-09-29 05:27:45,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:27:45,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:27:52,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=263226.6666666667, ans=0.0 2023-09-29 05:27:54,146 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 05:27:55,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:56,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=263226.6666666667, ans=0.2 2023-09-29 05:27:58,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=263226.6666666667, ans=0.0 2023-09-29 05:28:03,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:28:03,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:28:04,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:28:04,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:04,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 05:28:06,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:28:07,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:07,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:28:11,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:28:14,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:28:17,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:19,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=263360.0, ans=0.0 2023-09-29 05:28:22,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:28:22,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 05:28:22,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=263360.0, ans=0.2 2023-09-29 05:28:25,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:28:29,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:28:29,814 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=263360.0, ans=0.125 2023-09-29 05:28:32,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:34,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:28:34,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:28:34,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 05:28:37,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:28:37,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:37,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:37,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:28:39,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:40,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:28:40,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:28:42,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 05:28:42,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:28:42,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:42,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 05:28:51,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:28:51,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=263493.3333333333, ans=0.125 2023-09-29 05:28:55,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:28:58,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:58,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 05:28:58,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:29:01,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:29:01,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:01,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=263493.3333333333, ans=0.0 2023-09-29 05:29:03,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:29:03,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 05:29:07,055 INFO [train.py:1039] (0/4) Epoch 8, batch 2350, loss[loss=0.2229, simple_loss=0.2827, pruned_loss=0.08155, over 23701.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.2878, pruned_loss=0.07825, over 4691745.16 frames. ], batch size: 149, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:29:10,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:29:10,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 05:29:16,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 05:29:18,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:29:22,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:23,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:25,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 05:29:29,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:29:35,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 05:29:36,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=263626.6666666667, ans=22.5 2023-09-29 05:29:37,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:39,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:29:39,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 05:29:41,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=263693.3333333333, ans=0.2 2023-09-29 05:29:41,625 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.08 vs. limit=15.0 2023-09-29 05:29:43,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:29:46,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 05:29:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:29:49,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:50,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:29:50,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:29:55,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:29:58,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 05:29:58,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:30:01,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:30:01,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:30:03,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 05:30:03,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:30:06,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 05:30:08,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:30:11,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 05:30:14,767 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.176e+02 2.529e+02 2.945e+02 4.428e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 05:30:17,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 05:30:17,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:30:17,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 05:30:17,153 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 05:30:19,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. 
Duration: 0.51 2023-09-29 05:30:20,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 05:30:24,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:30:28,218 INFO [train.py:1039] (0/4) Epoch 8, batch 2400, loss[loss=0.2132, simple_loss=0.2782, pruned_loss=0.07404, over 23314.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2862, pruned_loss=0.07723, over 4698032.26 frames. ], batch size: 105, lr: 1.30e-02, grad_scale: 16.0 2023-09-29 05:30:28,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:30:32,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:30:34,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:30:35,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 05:30:35,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 05:30:43,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:30:43,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:30:46,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. 
Duration: 0.83 2023-09-29 05:30:46,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:30:47,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:48,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=263960.0, ans=0.0 2023-09-29 05:30:49,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 05:30:56,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:58,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 05:31:03,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:31:07,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 05:31:09,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:10,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:16,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:18,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. 
Duration: 0.73 2023-09-29 05:31:18,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:31:23,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=264093.3333333333, ans=0.1 2023-09-29 05:31:24,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:28,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:31:31,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:33,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:31:33,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:31:33,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:31:33,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:33,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:31:33,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 05:31:37,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:31:37,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:31:39,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 05:31:39,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 05:31:40,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:40,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:41,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 05:31:41,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 05:31:41,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 05:31:41,162 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 05:31:44,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 05:31:44,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:31:45,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:45,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:47,824 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 05:31:50,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:50,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:31:51,855 INFO [train.py:1039] (0/4) Epoch 8, batch 2450, loss[loss=0.2096, simple_loss=0.2885, pruned_loss=0.0653, over 24324.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2839, pruned_loss=0.07636, over 4702922.36 frames. ], batch size: 74, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:31:55,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:31:55,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:59,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:59,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:01,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. 
Duration: 0.75 2023-09-29 05:32:06,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:06,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:09,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:32:10,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:32:10,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:32:10,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 05:32:13,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:15,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=264293.3333333333, ans=0.1 2023-09-29 05:32:16,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:32:17,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:32:19,160 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.34 vs. 
limit=15.0 2023-09-29 05:32:21,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:32:23,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:24,667 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=264360.0, ans=0.125 2023-09-29 05:32:25,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:25,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:32:27,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 05:32:27,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:32:31,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=264360.0, ans=0.125 2023-09-29 05:32:35,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:37,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:37,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:38,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 05:32:38,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:39,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:32:40,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 05:32:43,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:43,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:32:48,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:32:48,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:53,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:32:53,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 05:32:54,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:32:54,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:54,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. 
Duration: 0.87 2023-09-29 05:32:56,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:32:58,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:33:01,896 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.058e+02 2.397e+02 2.730e+02 4.175e+02, threshold=4.793e+02, percent-clipped=0.0 2023-09-29 05:33:05,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:33:05,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=264493.3333333333, ans=0.0 2023-09-29 05:33:07,794 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.50 vs. limit=22.5 2023-09-29 05:33:08,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:08,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:33:11,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 05:33:11,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 05:33:12,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=264560.0, ans=0.125 2023-09-29 05:33:13,063 INFO [train.py:1039] (0/4) Epoch 8, batch 2500, loss[loss=0.2109, simple_loss=0.2759, pruned_loss=0.07297, over 23710.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2828, pruned_loss=0.07608, over 4696315.47 frames. ], batch size: 135, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:33:16,401 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:33:19,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:28,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:33:28,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:33:31,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:31,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 05:33:37,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:33:39,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:33:40,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 05:33:40,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:33:41,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.34 vs. limit=15.0 2023-09-29 05:33:42,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 05:33:43,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:43,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:45,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 05:33:45,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:45,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 05:33:45,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:33:50,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:51,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:54,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-29 05:33:54,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 05:33:54,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:33:56,962 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-09-29 05:33:57,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:02,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:05,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:07,889 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.68 vs. limit=22.5 2023-09-29 05:34:11,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:16,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:34:21,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 05:34:21,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 05:34:21,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:34:22,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:34:22,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:34:24,221 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 05:34:24,221 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 05:34:24,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 05:34:27,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:30,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 05:34:30,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 05:34:30,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=264826.6666666667, ans=0.125 2023-09-29 05:34:31,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 05:34:32,223 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=264893.3333333333, ans=0.1 2023-09-29 05:34:33,170 INFO [train.py:1039] (0/4) Epoch 8, batch 2550, loss[loss=0.2434, simple_loss=0.3143, pruned_loss=0.08627, over 23982.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2834, pruned_loss=0.07614, over 4702220.91 frames. ], batch size: 86, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:34:33,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 05:34:36,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 05:34:39,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:41,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:34:42,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:34:44,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:46,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=264893.3333333333, ans=0.125 2023-09-29 05:34:47,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 05:34:47,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 05:34:51,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 05:34:53,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:34:54,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:57,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:57,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 05:34:57,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:34:57,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:34:59,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:02,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:35:02,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 05:35:02,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:35:02,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 05:35:02,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 05:35:03,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=265026.6666666667, ans=0.2 2023-09-29 05:35:14,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=265026.6666666667, ans=0.05 2023-09-29 05:35:16,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:35:20,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:20,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:20,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:35:20,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=265093.3333333333, ans=0.1 2023-09-29 05:35:22,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:35:26,077 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=265093.3333333333, ans=0.1 2023-09-29 05:35:29,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:35:31,010 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=265093.3333333333, ans=0.5 2023-09-29 05:35:32,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:35:32,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:35:32,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:35:33,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:35:33,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:35:35,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:36,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:40,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:35:40,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 05:35:40,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:35:41,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:43,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:35:45,164 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.988e+02 2.217e+02 2.595e+02 3.453e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 05:35:45,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:35:46,279 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.30 vs. limit=22.5 2023-09-29 05:35:46,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:35:54,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:35:55,491 INFO [train.py:1039] (0/4) Epoch 8, batch 2600, loss[loss=0.2428, simple_loss=0.3109, pruned_loss=0.08737, over 24086.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2842, pruned_loss=0.07653, over 4705694.83 frames. ], batch size: 80, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:35:57,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:00,710 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. 
Duration: 0.896 2023-09-29 05:36:00,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=265226.6666666667, ans=0.0 2023-09-29 05:36:02,310 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 05:36:02,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:36:03,784 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 05:36:03,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 05:36:03,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 05:36:06,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:36:06,994 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 05:36:08,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 05:36:09,962 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 05:36:12,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:36:14,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 05:36:16,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. 
Duration: 0.77 2023-09-29 05:36:17,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:36:17,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 05:36:21,521 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 05:36:21,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 05:36:24,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=265293.3333333333, ans=0.125 2023-09-29 05:36:31,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:31,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:31,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:31,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 05:36:33,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:36:39,644 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 05:36:44,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:36:44,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:45,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 05:36:45,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:36:45,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:47,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 05:36:50,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:36:50,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:36:54,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:55,951 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 05:36:55,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:56,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:36:58,995 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.86 vs. 
limit=15.0 2023-09-29 05:37:03,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:37:04,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:37:04,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 05:37:06,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:37:08,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:09,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:10,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=265493.3333333333, ans=0.0 2023-09-29 05:37:15,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 05:37:16,983 INFO [train.py:1039] (0/4) Epoch 8, batch 2650, loss[loss=0.2145, simple_loss=0.2848, pruned_loss=0.07217, over 23925.00 frames. ], tot_loss[loss=0.2187, simple_loss=0.2846, pruned_loss=0.07638, over 4712190.82 frames. ], batch size: 86, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:37:17,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:18,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 05:37:22,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 05:37:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:24,502 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.96 vs. limit=15.0 2023-09-29 05:37:25,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:37:25,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=265560.0, ans=0.0 2023-09-29 05:37:26,466 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 05:37:26,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:37:28,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:31,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:37:33,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:35,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:37:36,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. 
Duration: 0.92 2023-09-29 05:37:36,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:37:36,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:37:41,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 05:37:41,717 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 05:37:44,155 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=265626.6666666667, ans=0.0 2023-09-29 05:37:45,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:48,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 05:37:48,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:37:48,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 05:37:50,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=265693.3333333333, ans=0.125 2023-09-29 05:37:51,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:53,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 05:37:53,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:54,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:37:57,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 05:37:57,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 05:38:01,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:05,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 05:38:05,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:06,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:06,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:06,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:08,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:10,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 05:38:10,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:11,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:38:13,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:38:14,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:38:17,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:19,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:38:19,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:21,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:21,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:38:26,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:28,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:38:28,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:38:29,355 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.037e+02 2.259e+02 2.622e+02 3.986e+02, threshold=4.518e+02, percent-clipped=0.0 2023-09-29 05:38:29,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 05:38:33,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:34,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:35,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:35,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:37,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:39,334 INFO [train.py:1039] (0/4) Epoch 8, batch 2700, loss[loss=0.2043, simple_loss=0.2807, pruned_loss=0.06395, over 24641.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2843, pruned_loss=0.07546, over 4728575.76 frames. ], batch size: 65, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:38:39,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:41,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:38:41,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. 
Duration: 0.84 2023-09-29 05:38:45,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:38:46,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 05:38:48,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:48,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:48,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:50,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:38:50,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:51,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:38:51,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:38:51,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 05:38:51,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:38:54,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 05:38:54,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:38:56,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:01,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:39:01,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 05:39:03,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:03,423 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=265960.0, ans=0.2 2023-09-29 05:39:09,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:39:09,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:09,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=265960.0, ans=0.125 2023-09-29 05:39:15,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:39:15,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:39:15,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 05:39:15,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:39:20,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:20,989 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.34 vs. limit=15.0 2023-09-29 05:39:23,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:23,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:39:23,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:39:27,228 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.86 vs. limit=15.0 2023-09-29 05:39:29,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:29,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:39:35,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=266093.3333333333, ans=0.0 2023-09-29 05:39:36,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 05:39:38,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:39:38,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.23 vs. limit=6.0 2023-09-29 05:39:41,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:39:41,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:44,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:45,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=266160.0, ans=0.035 2023-09-29 05:39:46,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:46,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:48,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:39:48,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=266160.0, ans=0.0 2023-09-29 05:39:50,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:39:52,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:53,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:53,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:56,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 05:39:56,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:59,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:39:59,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 05:40:01,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 05:40:01,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=266226.6666666667, ans=0.025 2023-09-29 05:40:02,919 INFO [train.py:1039] (0/4) Epoch 8, batch 2750, loss[loss=0.2316, simple_loss=0.2908, pruned_loss=0.08618, over 23621.00 frames. 
], tot_loss[loss=0.2177, simple_loss=0.2846, pruned_loss=0.07542, over 4741298.09 frames. ], batch size: 149, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:40:03,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:04,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:04,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:08,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:08,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:40:08,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:12,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:13,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:40:13,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:40:13,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:13,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-09-29 05:40:13,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:40:13,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:22,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 05:40:24,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:40:25,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:25,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:40:25,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:40:27,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:40:29,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:30,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:40:32,460 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:40:33,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:40:35,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:40:36,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:40:36,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:38,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:40:45,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:48,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:40:49,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:51,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:51,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:40:53,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 05:40:55,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=266426.6666666667, ans=0.2 2023-09-29 05:40:58,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:41:00,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:41:00,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 05:41:04,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:06,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 05:41:11,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:41:13,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:41:14,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 05:41:14,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:41:15,922 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 2.077e+02 2.428e+02 2.871e+02 4.392e+02, threshold=4.857e+02, percent-clipped=0.0 2023-09-29 05:41:18,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 05:41:18,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 05:41:18,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:41:18,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=266493.3333333333, ans=0.2 2023-09-29 05:41:20,241 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=266493.3333333333, ans=0.125 2023-09-29 05:41:21,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:41:21,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:21,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:41:22,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 05:41:23,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:23,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:25,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:25,245 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. 
Duration: 0.708 2023-09-29 05:41:25,246 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 05:41:26,530 INFO [train.py:1039] (0/4) Epoch 8, batch 2800, loss[loss=0.2388, simple_loss=0.2869, pruned_loss=0.09537, over 23817.00 frames. ], tot_loss[loss=0.2165, simple_loss=0.283, pruned_loss=0.07499, over 4722922.60 frames. ], batch size: 179, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:41:31,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:33,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:41:33,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:41:37,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:41:39,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 05:41:39,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=266560.0, ans=0.125 2023-09-29 05:41:42,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:41:43,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 05:41:45,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:41:46,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:41:46,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:41:50,317 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-40000.pt 2023-09-29 05:41:53,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:41:53,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:53,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:41:55,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:04,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:42:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:09,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:11,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:42:11,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 05:42:17,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:17,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 05:42:17,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:19,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:19,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:42:21,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=266760.0, ans=0.0 2023-09-29 05:42:22,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=266760.0, ans=0.125 2023-09-29 05:42:23,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:25,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:27,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:30,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:42:30,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:42:30,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:42:32,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:42:33,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:42:34,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:42:34,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 05:42:34,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:36,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:36,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:38,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 05:42:38,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:38,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 05:42:38,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=266826.6666666667, ans=22.5 2023-09-29 05:42:39,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:42:41,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 05:42:48,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:48,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:42:49,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:42:51,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:42:51,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=266893.3333333333, ans=0.1 2023-09-29 05:42:52,723 INFO [train.py:1039] (0/4) Epoch 8, batch 2850, loss[loss=0.197, simple_loss=0.2657, pruned_loss=0.06412, over 21952.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.2826, pruned_loss=0.07503, over 4723768.95 frames. ], batch size: 48, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:42:57,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:42:57,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:42:57,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:58,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:00,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:43:02,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:43:04,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 05:43:08,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=266960.0, ans=0.2 2023-09-29 05:43:11,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 05:43:11,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:13,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 05:43:13,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:16,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 05:43:16,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-09-29 05:43:20,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:32,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:33,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:35,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:43:35,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:43:37,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:43:37,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:43:39,426 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=267026.6666666667, ans=0.125 2023-09-29 05:43:40,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:43:40,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 05:43:42,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:43:42,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:43:44,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:44,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:46,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:46,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:48,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:49,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:52,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:43:52,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:54,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:57,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:44:02,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:44:04,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. 
Duration: 0.83 2023-09-29 05:44:04,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 05:44:05,428 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.097e+02 2.299e+02 2.689e+02 7.485e+02, threshold=4.599e+02, percent-clipped=2.0 2023-09-29 05:44:07,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:44:07,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:07,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 05:44:08,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:44:10,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:10,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:10,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:44:10,248 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 05:44:11,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 05:44:11,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 05:44:13,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:15,346 INFO [train.py:1039] (0/4) Epoch 8, batch 2900, loss[loss=0.2399, simple_loss=0.2941, pruned_loss=0.09279, over 23367.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2831, pruned_loss=0.07547, over 4723755.14 frames. ], batch size: 285, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:44:15,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:17,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:17,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:44:19,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 05:44:23,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:23,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 05:44:24,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 05:44:25,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:44:25,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 05:44:27,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:29,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:44:33,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:33,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:37,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:44:38,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 05:44:39,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:44:39,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=267293.3333333333, ans=0.125 2023-09-29 05:44:41,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:41,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=267293.3333333333, ans=0.1 2023-09-29 05:44:42,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 05:44:44,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. 
Duration: 0.72 2023-09-29 05:44:45,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:45,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 05:44:45,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:44:49,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:44:49,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:52,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:53,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=267360.0, ans=0.125 2023-09-29 05:44:56,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:58,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.31 vs. limit=15.0 2023-09-29 05:44:59,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:45:03,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:04,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. 
Duration: 0.71 2023-09-29 05:45:04,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 05:45:04,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:45:09,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:45:12,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 05:45:12,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=267426.6666666667, ans=0.09899494936611666 2023-09-29 05:45:13,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:45:14,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=267426.6666666667, ans=0.125 2023-09-29 05:45:18,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=267493.3333333333, ans=0.125 2023-09-29 05:45:20,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:45:28,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:45:28,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:45:30,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-09-29 05:45:33,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:33,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 05:45:34,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:34,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:45:36,153 INFO [train.py:1039] (0/4) Epoch 8, batch 2950, loss[loss=0.2465, simple_loss=0.3037, pruned_loss=0.09465, over 23813.00 frames. ], tot_loss[loss=0.2182, simple_loss=0.2845, pruned_loss=0.07593, over 4718991.72 frames. ], batch size: 212, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:45:41,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:43,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 05:45:43,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:44,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:46,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:45:48,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 05:45:49,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 05:45:51,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 05:45:51,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:45:51,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:53,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=267626.6666666667, ans=0.07 2023-09-29 05:45:56,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:45:59,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:01,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:01,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:02,919 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.28 vs. 
limit=22.5 2023-09-29 05:46:03,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=267626.6666666667, ans=0.0 2023-09-29 05:46:04,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:05,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:46:06,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:46:08,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=267693.3333333333, ans=0.0 2023-09-29 05:46:09,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 05:46:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 05:46:18,108 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 05:46:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:46:21,163 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-09-29 05:46:21,416 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=267693.3333333333, ans=0.125 2023-09-29 05:46:22,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 05:46:22,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:24,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:46:24,233 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 05:46:24,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:46:27,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 05:46:28,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:30,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:46:33,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:35,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:46:35,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 05:46:35,176 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 05:46:36,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:36,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 05:46:36,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=267760.0, ans=0.2 2023-09-29 05:46:43,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:45,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:46:45,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 05:46:45,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:46:47,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 05:46:50,058 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.20 vs. limit=22.5 2023-09-29 05:46:50,562 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.933e+02 2.152e+02 2.474e+02 4.181e+02, threshold=4.303e+02, percent-clipped=1.0 2023-09-29 05:46:50,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 05:46:52,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:52,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:46:55,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:55,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:46:55,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:46:57,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:57,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:46:57,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:46:57,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:57,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=267893.3333333333, ans=0.125 2023-09-29 05:46:58,567 INFO [train.py:1039] (0/4) Epoch 8, batch 3000, loss[loss=0.2075, simple_loss=0.2698, pruned_loss=0.07257, over 23608.00 frames. ], tot_loss[loss=0.2179, simple_loss=0.2846, pruned_loss=0.07559, over 4726746.64 frames. 
], batch size: 134, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:46:58,568 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 05:47:12,756 INFO [train.py:1071] (0/4) Epoch 8, validation: loss=0.3012, simple_loss=0.2865, pruned_loss=0.1579, over 1125622.00 frames. 2023-09-29 05:47:12,757 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 05:47:14,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:47:15,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:15,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 05:47:16,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:18,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:47:20,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:47:23,402 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 05:47:23,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 05:47:25,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:47:26,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 05:47:26,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 05:47:26,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:32,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:47:39,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=267960.0, ans=0.125 2023-09-29 05:47:45,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:47:47,441 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.47 vs. limit=22.5 2023-09-29 05:47:51,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 05:47:52,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:47:56,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:47:57,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:57,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:47:59,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 05:47:59,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 05:48:01,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 05:48:04,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:48:04,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:48:05,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:48:05,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:07,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:07,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:48:09,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=268093.3333333333, ans=0.2 2023-09-29 05:48:11,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:48:12,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:48:12,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 05:48:13,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:16,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 05:48:16,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:48:16,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:17,511 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.03 vs. limit=15.0 2023-09-29 05:48:19,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:48:22,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:22,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:25,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:48:25,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 05:48:25,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:48:26,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-09-29 05:48:26,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:48:31,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 05:48:34,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:48:34,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:48:34,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 05:48:35,784 INFO [train.py:1039] (0/4) Epoch 8, batch 3050, loss[loss=0.2092, simple_loss=0.2978, pruned_loss=0.06027, over 24541.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2855, pruned_loss=0.07614, over 4719065.13 frames. ], batch size: 71, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:48:37,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 05:48:37,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:48:37,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:48:38,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:38,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 05:48:38,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:38,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:48:42,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 05:48:44,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:48:47,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:48:47,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:48:51,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:55,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 05:49:03,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 05:49:03,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 05:49:03,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:06,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 05:49:10,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:10,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:10,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:11,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:13,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:49:13,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:14,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:14,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:14,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:17,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:20,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:20,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. 
Duration: 0.96 2023-09-29 05:49:22,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:22,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:49:27,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:49:27,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:49:27,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:49:29,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:29,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.whiten.whitening_limit, batch_count=268426.6666666667, ans=12.0 2023-09-29 05:49:34,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:36,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:38,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=268426.6666666667, ans=0.125 2023-09-29 05:49:42,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:49:42,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:49:42,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:44,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:44,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:49:44,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:45,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 05:49:47,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:48,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:50,195 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.057e+02 2.311e+02 2.662e+02 5.288e+02, threshold=4.621e+02, percent-clipped=1.0 2023-09-29 05:49:50,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 05:49:51,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:56,334 INFO [train.py:1039] (0/4) Epoch 8, batch 3100, loss[loss=0.2107, simple_loss=0.2939, pruned_loss=0.06379, over 24666.00 frames. 
], tot_loss[loss=0.218, simple_loss=0.2846, pruned_loss=0.0757, over 4717835.94 frames. ], batch size: 73, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:49:57,461 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.00 vs. limit=15.0 2023-09-29 05:49:57,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:58,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:49:59,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:50:02,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 05:50:05,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 05:50:06,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 05:50:06,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=268560.0, ans=0.0 2023-09-29 05:50:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:50:13,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:50:13,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:50:15,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:50:15,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=268626.6666666667, ans=0.125 2023-09-29 05:50:18,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:24,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 05:50:29,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=268693.3333333333, ans=0.125 2023-09-29 05:50:30,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:50:30,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:31,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:31,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:50:33,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:50:35,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:50:35,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. 
Duration: 0.62 2023-09-29 05:50:35,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:50:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 05:50:41,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:50:44,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:50:44,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 05:50:46,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 05:50:47,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:47,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:49,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:50:49,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:49,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 05:50:51,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:50:51,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:50:51,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=268760.0, ans=0.0 2023-09-29 05:50:53,244 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.72 vs. limit=15.0 2023-09-29 05:50:54,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:50:54,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:50:54,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:54,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 05:50:57,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:59,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 05:51:00,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:51:02,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-09-29 05:51:02,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:02,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:03,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 05:51:10,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=268826.6666666667, ans=0.125 2023-09-29 05:51:16,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 05:51:18,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=268893.3333333333, ans=0.0 2023-09-29 05:51:19,613 INFO [train.py:1039] (0/4) Epoch 8, batch 3150, loss[loss=0.2104, simple_loss=0.2955, pruned_loss=0.0626, over 24661.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2832, pruned_loss=0.0758, over 4700472.19 frames. ], batch size: 73, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:51:21,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:21,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:23,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=268893.3333333333, ans=0.2 2023-09-29 05:51:24,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 05:51:24,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:51:25,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 05:51:27,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:51:28,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 05:51:30,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:32,085 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 05:51:32,747 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=15.0 2023-09-29 05:51:35,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 05:51:35,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:51:36,685 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 05:51:38,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. 
Duration: 0.47775 2023-09-29 05:51:38,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 05:51:38,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=268960.0, ans=0.125 2023-09-29 05:51:40,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 05:51:40,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 05:51:40,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:40,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:51:41,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:43,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 05:51:45,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=268960.0, ans=0.1 2023-09-29 05:51:47,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 05:51:51,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:51:54,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 05:51:54,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:51:55,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:51:57,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:57,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 05:51:59,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 05:52:00,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:52:02,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:52:02,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:52:02,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:02,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 05:52:03,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:52:03,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:52:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 05:52:05,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:52:05,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:07,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:52:07,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:52:07,683 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.75 vs. limit=15.0 2023-09-29 05:52:09,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 05:52:09,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:12,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 05:52:12,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:52:13,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 05:52:15,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 05:52:15,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:52:17,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:18,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 05:52:19,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:52:21,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:23,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:52:25,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:26,079 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.68 vs. limit=6.0 2023-09-29 05:52:26,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 05:52:28,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=269160.0, ans=0.1 2023-09-29 05:52:31,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:52:31,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:33,425 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.41 vs. limit=6.0 2023-09-29 05:52:34,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:52:35,515 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.026e+02 2.271e+02 2.794e+02 4.211e+02, threshold=4.543e+02, percent-clipped=0.0 2023-09-29 05:52:35,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=269160.0, ans=0.125 2023-09-29 05:52:38,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:52:38,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:52:42,182 INFO [train.py:1039] (0/4) Epoch 8, batch 3200, loss[loss=0.2251, simple_loss=0.2902, pruned_loss=0.08005, over 23716.00 frames. ], tot_loss[loss=0.2159, simple_loss=0.2818, pruned_loss=0.07494, over 4706379.69 frames. 
], batch size: 135, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:52:43,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:46,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:52:46,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 05:52:50,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:57,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:53:02,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:53:11,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:53:21,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=269360.0, ans=10.0 2023-09-29 05:53:22,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 05:53:24,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:53:26,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=269360.0, ans=0.5 2023-09-29 05:53:28,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. 
Duration: 0.59 2023-09-29 05:53:29,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:53:31,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:53:31,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:53:33,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:53:38,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 05:53:39,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:53:43,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 05:53:46,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 05:53:47,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:53:52,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:53,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:53:53,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 05:53:53,998 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 05:53:54,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:53:55,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=269493.3333333333, ans=0.0 2023-09-29 05:53:55,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=269493.3333333333, ans=0.0 2023-09-29 05:53:59,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:00,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 05:54:00,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 05:54:02,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 05:54:03,733 INFO [train.py:1039] (0/4) Epoch 8, batch 3250, loss[loss=0.2165, simple_loss=0.2886, pruned_loss=0.07215, over 23728.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2824, pruned_loss=0.07487, over 4712253.49 frames. ], batch size: 85, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:54:03,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 05:54:06,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 05:54:08,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:54:11,142 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 05:54:11,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:11,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:12,678 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 05:54:17,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:54:18,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:26,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:54:26,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 05:54:28,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:28,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:54:28,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 05:54:29,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:30,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:54:34,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:54:34,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:34,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:35,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:54:38,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:39,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:42,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:42,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 05:54:44,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:45,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:45,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:54:47,282 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.72 vs. limit=6.0 2023-09-29 05:54:49,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 05:54:49,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:49,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:54:52,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:52,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:55:00,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:55:08,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 05:55:09,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:09,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 05:55:09,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:55:09,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:55:09,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:13,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 05:55:13,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 05:55:13,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:55:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:16,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:16,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. 
Duration: 0.52225 2023-09-29 05:55:17,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=269826.6666666667, ans=0.1 2023-09-29 05:55:18,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:19,807 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.998e+02 2.232e+02 2.545e+02 3.931e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 05:55:21,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:23,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:24,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 05:55:24,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:25,870 INFO [train.py:1039] (0/4) Epoch 8, batch 3300, loss[loss=0.1936, simple_loss=0.2682, pruned_loss=0.05947, over 24645.00 frames. ], tot_loss[loss=0.2163, simple_loss=0.2829, pruned_loss=0.07487, over 4699178.93 frames. ], batch size: 65, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:55:27,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:55:27,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 05:55:30,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 05:55:30,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 05:55:32,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 05:55:33,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 05:55:33,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:38,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:40,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:55:40,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:43,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:55:43,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:55:45,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:46,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:50,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. 
Duration: 0.79 2023-09-29 05:55:50,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:55:50,873 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.60 vs. limit=6.0 2023-09-29 05:55:51,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:51,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:52,952 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.89 vs. limit=10.0 2023-09-29 05:55:53,875 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 05:55:55,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:55:55,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:55:56,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:55:56,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:55:57,033 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 05:56:01,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:56:01,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:56:04,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:04,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 05:56:06,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 05:56:06,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:07,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:56:09,560 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 05:56:11,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 05:56:11,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:14,087 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.92 vs. limit=15.0 2023-09-29 05:56:15,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 05:56:18,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 05:56:19,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:56:21,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:23,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:23,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:23,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:23,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:56:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:56:26,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:26,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:56:28,431 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 05:56:28,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=270093.3333333333, ans=0.125 2023-09-29 05:56:29,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. 
Duration: 0.55 2023-09-29 05:56:31,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:56:32,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:56:32,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:33,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=270160.0, ans=0.125 2023-09-29 05:56:34,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:34,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:36,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:56:37,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:37,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:56:37,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:39,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:56:43,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. 
Duration: 0.77 2023-09-29 05:56:44,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:45,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:46,933 INFO [train.py:1039] (0/4) Epoch 8, batch 3350, loss[loss=0.1924, simple_loss=0.2623, pruned_loss=0.06123, over 24289.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2842, pruned_loss=0.07572, over 4703509.95 frames. ], batch size: 56, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:56:47,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:56:47,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:49,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:52,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:52,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:52,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=270226.6666666667, ans=0.125 2023-09-29 05:56:55,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:57,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 05:56:58,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:59,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:02,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:57:04,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:04,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:57:04,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=270293.3333333333, ans=0.2 2023-09-29 05:57:05,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 05:57:08,670 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 05:57:08,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:11,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 05:57:11,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 05:57:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 05:57:14,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:57:16,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 05:57:16,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:16,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:57:18,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:21,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:21,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:23,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:57:25,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:28,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:28,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 05:57:34,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:57:36,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:38,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:38,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:41,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=270426.6666666667, ans=0.0 2023-09-29 05:57:43,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 05:57:43,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:57:43,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 05:57:43,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:57:44,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=270426.6666666667, ans=0.2 2023-09-29 05:57:45,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. 
Duration: 0.62 2023-09-29 05:57:46,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:46,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:54,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:56,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 05:57:56,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:57:58,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:57:59,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:58:02,775 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 2.038e+02 2.381e+02 2.842e+02 4.419e+02, threshold=4.763e+02, percent-clipped=0.0 2023-09-29 05:58:04,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:06,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 05:58:06,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:58:08,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 05:58:08,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=270560.0, ans=0.0 2023-09-29 05:58:09,752 INFO [train.py:1039] (0/4) Epoch 8, batch 3400, loss[loss=0.2028, simple_loss=0.2808, pruned_loss=0.06244, over 24380.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.2852, pruned_loss=0.07622, over 4711641.38 frames. ], batch size: 77, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:58:09,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:09,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 05:58:11,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:58:12,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 05:58:13,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:58:16,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:58:16,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. 
Duration: 0.69 2023-09-29 05:58:19,870 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=270560.0, ans=0.09899494936611666 2023-09-29 05:58:21,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 05:58:22,518 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 05:58:22,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:58:27,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:27,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:58:29,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:29,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:58:31,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=270626.6666666667, ans=0.0 2023-09-29 05:58:31,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=270626.6666666667, ans=0.1 2023-09-29 05:58:33,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=270626.6666666667, ans=0.2 2023-09-29 05:58:37,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 05:58:39,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 05:58:39,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=270626.6666666667, ans=0.125 2023-09-29 05:58:42,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=270693.3333333333, ans=0.125 2023-09-29 05:58:42,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=270693.3333333333, ans=0.0 2023-09-29 05:58:43,103 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.15 vs. limit=15.0 2023-09-29 05:58:44,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:58:46,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:47,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:48,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:58:54,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 05:58:54,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=270693.3333333333, ans=0.0 2023-09-29 05:58:56,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 05:59:03,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 05:59:07,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:07,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:59:09,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:59:09,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=270760.0, ans=0.0 2023-09-29 05:59:11,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:59:14,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 05:59:14,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:59:20,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:21,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 05:59:26,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:59:31,896 INFO [train.py:1039] (0/4) Epoch 8, batch 3450, loss[loss=0.2241, simple_loss=0.2854, pruned_loss=0.08135, over 23793.00 frames. ], tot_loss[loss=0.22, simple_loss=0.2857, pruned_loss=0.07712, over 4709724.33 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:59:32,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 05:59:33,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=270893.3333333333, ans=0.0 2023-09-29 05:59:38,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 05:59:38,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:40,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:59:40,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-09-29 05:59:42,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:45,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:59:50,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:59:52,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:59:53,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:59:53,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:55,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:56,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=270960.0, ans=0.0 2023-09-29 06:00:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 06:00:06,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 06:00:06,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 06:00:06,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=271026.6666666667, ans=0.0 2023-09-29 06:00:08,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:00:08,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:15,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 06:00:15,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:00:20,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:20,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:00:23,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:00:25,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:00:25,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=271093.3333333333, ans=0.1 2023-09-29 06:00:26,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 06:00:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:00:28,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:00:31,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:00:33,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 06:00:36,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:00:36,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=271160.0, ans=0.125 2023-09-29 06:00:41,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:00:42,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:46,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:47,987 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.293e+02 2.662e+02 4.290e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 06:00:51,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:51,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:00:52,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:00:52,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:54,262 INFO [train.py:1039] (0/4) Epoch 8, batch 3500, loss[loss=0.2071, simple_loss=0.2717, pruned_loss=0.07123, over 23760.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.284, pruned_loss=0.07675, over 4695317.00 frames. ], batch size: 135, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:00:56,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:01:00,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:01:01,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 06:01:03,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:01:06,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:01:08,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=271226.6666666667, ans=0.0 2023-09-29 06:01:09,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:01:09,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-09-29 06:01:11,829 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.59 vs. limit=6.0 2023-09-29 06:01:14,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:01:15,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:01:15,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:01:17,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:17,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:01:17,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:19,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:19,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 06:01:22,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:22,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 06:01:24,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:28,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:28,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 06:01:29,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:29,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=271360.0, ans=0.04949747468305833 2023-09-29 06:01:32,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:34,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:01:36,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:39,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:01:39,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:40,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 06:01:42,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. 
Duration: 0.61 2023-09-29 06:01:42,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=271426.6666666667, ans=0.125 2023-09-29 06:01:43,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 06:01:43,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:46,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:46,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:46,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:01:49,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:01:51,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:01:57,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:01:58,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 06:01:58,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 06:01:58,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 06:02:03,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:05,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:06,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:08,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 06:02:08,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:10,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:02:12,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 06:02:13,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 06:02:16,434 INFO [train.py:1039] (0/4) Epoch 8, batch 3550, loss[loss=0.2265, simple_loss=0.2806, pruned_loss=0.08618, over 23820.00 frames. ], tot_loss[loss=0.2165, simple_loss=0.2822, pruned_loss=0.07538, over 4700625.04 frames. ], batch size: 212, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:02:16,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:18,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 06:02:19,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:19,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:22,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:02:28,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=271560.0, ans=0.2 2023-09-29 06:02:31,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:32,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:02:36,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:02:37,428 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.60 vs. limit=15.0 2023-09-29 06:02:37,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:02:39,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:40,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:02:41,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 06:02:44,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:45,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:02:45,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:47,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:02:47,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:02:51,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=271693.3333333333, ans=0.125 2023-09-29 06:02:52,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:02:52,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:54,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:02:54,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:55,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:02:55,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. 
Duration: 0.38 2023-09-29 06:02:55,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:57,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:58,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 06:03:05,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:05,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=271760.0, ans=0.025 2023-09-29 06:03:07,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:03:08,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:08,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 06:03:09,233 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.31 vs. limit=15.0 2023-09-29 06:03:10,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:03:12,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. 
Duration: 0.86 2023-09-29 06:03:12,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:03:15,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:03:15,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:03:19,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=271760.0, ans=0.125 2023-09-29 06:03:19,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 06:03:20,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:24,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:25,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 06:03:25,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:30,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:03:31,726 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.264e+02 2.526e+02 3.446e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 06:03:31,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-09-29 06:03:39,314 INFO [train.py:1039] (0/4) Epoch 8, batch 3600, loss[loss=0.2174, simple_loss=0.2971, pruned_loss=0.06887, over 24470.00 frames. ], tot_loss[loss=0.2158, simple_loss=0.2816, pruned_loss=0.07501, over 4691529.39 frames. ], batch size: 66, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:03:40,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 06:03:41,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:03:41,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:03:44,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:44,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:46,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:03:50,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:03:52,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:54,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:03:54,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 06:03:55,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:55,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 06:04:00,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:04:02,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:06,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:07,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:04:07,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:04:07,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 06:04:09,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:11,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:13,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 06:04:16,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:17,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:19,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:20,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 06:04:23,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=272026.6666666667, ans=0.0 2023-09-29 06:04:27,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:29,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:04:29,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 06:04:34,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:04:41,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:42,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:47,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 06:04:47,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:04:49,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 06:04:49,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 06:04:52,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 06:04:54,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:54,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:04:55,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 06:04:57,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:04:57,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:04:58,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:59,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 06:05:00,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. 
Duration: 0.45 2023-09-29 06:05:01,992 INFO [train.py:1039] (0/4) Epoch 8, batch 3650, loss[loss=0.2242, simple_loss=0.2801, pruned_loss=0.0842, over 23458.00 frames. ], tot_loss[loss=0.2168, simple_loss=0.2828, pruned_loss=0.07542, over 4702095.09 frames. ], batch size: 285, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:05:03,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:05:03,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 06:05:06,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=272226.6666666667, ans=0.125 2023-09-29 06:05:07,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 06:05:09,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:05:12,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 06:05:14,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 06:05:19,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:19,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:05:19,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 06:05:22,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=272293.3333333333, ans=0.0 2023-09-29 06:05:24,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 06:05:25,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:05:26,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 06:05:27,358 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.07 vs. limit=15.0 2023-09-29 06:05:28,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:05:28,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:28,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=272293.3333333333, ans=0.125 2023-09-29 06:05:30,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 06:05:32,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:05:32,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 06:05:32,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:35,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:05:36,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 06:05:38,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 06:05:39,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:05:41,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 06:05:43,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:05:43,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:05:47,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=272360.0, ans=0.125 2023-09-29 06:05:48,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:05:50,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:50,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 06:05:51,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:05:53,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:05:55,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:05:57,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:58,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:58,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:06:00,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:06:03,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:06:03,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:11,357 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 06:06:14,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:16,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 06:06:16,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:06:16,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:17,949 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.110e+02 2.350e+02 2.595e+02 3.564e+02, threshold=4.700e+02, percent-clipped=0.0 2023-09-29 06:06:18,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:06:18,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:21,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 06:06:21,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:23,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:06:25,106 INFO [train.py:1039] (0/4) Epoch 8, batch 3700, loss[loss=0.2251, simple_loss=0.287, pruned_loss=0.08158, over 23701.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2847, pruned_loss=0.07608, over 4713192.77 frames. ], batch size: 135, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:06:26,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:06:28,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 06:06:30,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:30,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 06:06:30,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:31,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:06:32,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:06:36,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:06:38,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=272560.0, ans=0.125 2023-09-29 06:06:40,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:41,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:41,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:06:41,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:43,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 06:06:46,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:48,215 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 06:06:57,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:06:57,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:06:58,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:06:58,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 06:06:58,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:02,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:03,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 06:07:05,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:07,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:07:10,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 06:07:10,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:07:12,500 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.06 vs. limit=22.5 2023-09-29 06:07:13,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:07:18,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:18,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 06:07:18,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:07:19,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 06:07:24,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:07:25,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:07:29,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:29,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 06:07:32,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:07:32,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:07:32,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:32,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:37,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:39,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 06:07:39,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 06:07:41,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:07:41,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:07:42,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:07:43,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:07:46,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:47,585 INFO [train.py:1039] (0/4) Epoch 8, batch 3750, loss[loss=0.2041, simple_loss=0.2836, pruned_loss=0.06229, over 24435.00 frames. 
], tot_loss[loss=0.2194, simple_loss=0.2858, pruned_loss=0.07647, over 4709842.06 frames. ], batch size: 69, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:07:47,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:07:49,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:07:49,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=272893.3333333333, ans=0.125 2023-09-29 06:07:52,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 06:07:54,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:07:57,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:07:57,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 06:07:58,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:08:00,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:01,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:05,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 06:08:07,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:10,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:08:11,982 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.23 vs. limit=15.0 2023-09-29 06:08:12,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:08:16,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:08:18,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:20,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 06:08:20,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:22,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:22,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:25,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. 
Duration: 0.55 2023-09-29 06:08:29,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=273026.6666666667, ans=0.0 2023-09-29 06:08:30,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 06:08:31,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:32,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:33,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:34,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=273026.6666666667, ans=0.04949747468305833 2023-09-29 06:08:37,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:41,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:08:44,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 06:08:46,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:48,518 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.90 vs. 
limit=15.0 2023-09-29 06:08:49,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:51,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:08:54,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:08:54,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=273160.0, ans=0.1 2023-09-29 06:08:57,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:08:59,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:09:01,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:09:02,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:09:04,243 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.271e+02 2.610e+02 3.277e+02 5.264e+02, threshold=5.220e+02, percent-clipped=1.0 2023-09-29 06:09:05,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:09:11,160 INFO [train.py:1039] (0/4) Epoch 8, batch 3800, loss[loss=0.2045, simple_loss=0.2696, pruned_loss=0.06967, over 24280.00 frames. ], tot_loss[loss=0.2197, simple_loss=0.2857, pruned_loss=0.07682, over 4703971.34 frames. 
], batch size: 56, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:09:16,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:09:20,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:22,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:09:22,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 06:09:24,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:27,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:27,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:09:29,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=273293.3333333333, ans=0.0 2023-09-29 06:09:30,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:09:30,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:30,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 06:09:32,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:32,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:09:32,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:34,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 06:09:39,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:09:39,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:09:41,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:45,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:09:47,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:09:49,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:09:49,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:09:52,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:57,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:09:57,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 06:10:00,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:05,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:11,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:10:14,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 06:10:14,900 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.71 vs. limit=15.0 2023-09-29 06:10:15,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 06:10:15,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:19,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:20,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:10:20,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 06:10:24,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 06:10:24,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 06:10:26,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:27,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:33,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:10:34,262 INFO [train.py:1039] (0/4) Epoch 8, batch 3850, loss[loss=0.201, simple_loss=0.269, pruned_loss=0.06649, over 23704.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2848, pruned_loss=0.07652, over 4700636.65 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:10:34,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:10:39,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=273560.0, ans=0.125 2023-09-29 06:10:40,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:10:41,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. 
Duration: 0.81 2023-09-29 06:10:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:10:42,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=273560.0, ans=0.0 2023-09-29 06:10:44,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:47,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:10:48,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:50,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:10:53,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 06:11:00,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:01,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:11:05,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:05,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:11:09,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:11:10,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:11:11,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:11,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:11:13,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:14,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:14,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:15,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:11:16,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 06:11:16,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 06:11:18,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:18,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 06:11:21,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 06:11:24,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 06:11:26,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:28,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 06:11:30,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:11:35,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:37,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:39,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=273826.6666666667, ans=0.125 2023-09-29 06:11:43,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:43,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 06:11:45,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-09-29 06:11:45,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=273826.6666666667, ans=0.0 2023-09-29 06:11:48,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:48,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:50,372 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.044e+02 2.316e+02 2.829e+02 5.158e+02, threshold=4.631e+02, percent-clipped=0.0 2023-09-29 06:11:52,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:11:52,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:11:52,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:11:53,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 06:11:55,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:11:56,709 INFO [train.py:1039] (0/4) Epoch 8, batch 3900, loss[loss=0.1794, simple_loss=0.2534, pruned_loss=0.05268, over 24312.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2832, pruned_loss=0.07586, over 4693806.24 frames. ], batch size: 61, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:11:56,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 06:11:56,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:56,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:58,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:11:59,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:12:02,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:12:02,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:12:04,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:04,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-09-29 06:12:05,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:08,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:10,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:10,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:12:11,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:13,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:15,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:16,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:12:18,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 06:12:18,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:19,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 06:12:20,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:12:21,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 06:12:23,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 06:12:25,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=273960.0, ans=0.0 2023-09-29 06:12:28,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:29,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:29,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:12:31,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:12:33,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=274026.6666666667, ans=0.125 2023-09-29 06:12:34,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:36,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:12:39,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:12:39,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:12:39,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:12:45,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:45,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:12:53,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:12:55,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:13:05,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:08,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 06:13:08,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 06:13:08,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:10,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 06:13:12,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:13:13,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 06:13:15,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=274160.0, ans=0.125 2023-09-29 06:13:15,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=274160.0, ans=0.125 2023-09-29 06:13:20,396 INFO [train.py:1039] (0/4) Epoch 8, batch 3950, loss[loss=0.2077, simple_loss=0.2708, pruned_loss=0.07229, over 23590.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2825, pruned_loss=0.07518, over 4696949.12 frames. ], batch size: 149, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:13:22,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:13:23,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 06:13:23,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:13:26,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:13:29,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:13:36,105 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 06:13:36,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 06:13:36,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 06:13:37,609 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 06:13:37,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:40,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:40,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:13:40,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:45,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 06:13:47,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:13:49,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:49,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:13:49,908 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=15.0 2023-09-29 06:13:50,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 06:13:52,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:13:54,000 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=274360.0, ans=0.125 2023-09-29 06:13:55,600 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:14:03,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:14:03,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:14:07,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 06:14:12,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 06:14:12,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 06:14:14,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:14:15,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:14:24,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:14:25,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 06:14:26,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:14:26,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:14:26,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 06:14:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:14:32,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:14:36,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 06:14:36,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=274493.3333333333, ans=0.125 2023-09-29 06:14:37,533 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.191e+02 2.483e+02 2.950e+02 4.567e+02, threshold=4.966e+02, percent-clipped=0.0 2023-09-29 06:14:42,739 INFO [train.py:1039] (0/4) Epoch 8, batch 4000, loss[loss=0.2379, simple_loss=0.295, pruned_loss=0.09041, over 22767.00 frames. ], tot_loss[loss=0.2169, simple_loss=0.2833, pruned_loss=0.07522, over 4701663.46 frames. ], batch size: 322, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:14:46,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:14:55,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:15:00,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=274626.6666666667, ans=0.0 2023-09-29 06:15:02,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:02,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:15:02,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:15:03,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 06:15:03,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:15:04,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 06:15:06,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:15:06,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 06:15:08,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:11,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:15:11,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:15:11,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:15:12,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:12,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:15:14,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:15:15,909 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 06:15:17,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:15:17,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:18,162 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.87 vs. limit=10.0 2023-09-29 06:15:21,111 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 06:15:22,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:15:22,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:28,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-09-29 06:15:31,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:32,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:15:34,629 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 06:15:36,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:15:36,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 06:15:36,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:15:36,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:37,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:15:39,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:15:41,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:15:41,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:43,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. 
Duration: 0.84 2023-09-29 06:15:43,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:46,076 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 06:15:52,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:15:54,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:15:56,519 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.38 vs. limit=22.5 2023-09-29 06:15:57,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:15:57,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:58,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:16:01,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:04,755 INFO [train.py:1039] (0/4) Epoch 8, batch 4050, loss[loss=0.2337, simple_loss=0.3089, pruned_loss=0.07924, over 24428.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2835, pruned_loss=0.07449, over 4723130.79 frames. ], batch size: 77, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:16:08,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 06:16:11,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:16:11,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 06:16:13,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:16:14,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=274893.3333333333, ans=0.125 2023-09-29 06:16:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:17,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:16:18,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:18,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:22,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:26,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:16:26,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-29 06:16:28,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=274960.0, ans=0.0 2023-09-29 06:16:30,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:16:30,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:16:32,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:35,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:38,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 06:16:38,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 06:16:39,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 06:16:40,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=275026.6666666667, ans=0.2 2023-09-29 06:16:41,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:16:48,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 06:16:50,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:16:55,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:58,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:17:00,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:00,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:17:03,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:17:06,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 06:17:06,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:17:08,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:08,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 06:17:11,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=275160.0, ans=0.2 2023-09-29 06:17:13,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:22,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. 
Duration: 0.85 2023-09-29 06:17:23,493 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.014e+02 2.168e+02 2.447e+02 3.458e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 06:17:23,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:17:23,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:17:27,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 06:17:27,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 06:17:27,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:28,553 INFO [train.py:1039] (0/4) Epoch 8, batch 4100, loss[loss=0.2492, simple_loss=0.3, pruned_loss=0.09918, over 23862.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2845, pruned_loss=0.07529, over 4719784.80 frames. ], batch size: 150, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:17:28,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:17:30,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:30,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 06:17:35,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=275226.6666666667, ans=0.125 2023-09-29 06:17:36,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 06:17:38,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 06:17:38,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 06:17:40,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 06:17:40,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:40,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:17:42,180 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 06:17:43,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:45,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 06:17:46,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:47,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:17:51,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:17:53,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:54,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:17:54,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 06:17:57,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:57,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:17:57,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:17:57,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:59,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-09-29 06:17:59,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=275293.3333333333, ans=0.125 2023-09-29 06:18:00,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 06:18:03,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:18:06,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:18:06,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 06:18:08,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:18:08,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:18:08,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:18:11,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 06:18:14,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:18:15,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 06:18:16,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 06:18:18,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:18:18,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:23,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:27,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:31,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:31,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:18:31,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=275426.6666666667, ans=0.125 2023-09-29 06:18:32,369 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.16 vs. limit=15.0 2023-09-29 06:18:42,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:18:42,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 06:18:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:47,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:18:50,473 INFO [train.py:1039] (0/4) Epoch 8, batch 4150, loss[loss=0.2031, simple_loss=0.2727, pruned_loss=0.06669, over 24670.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2856, pruned_loss=0.07581, over 4722195.05 frames. ], batch size: 65, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:18:50,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:52,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:18:54,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:18:54,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:18:57,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 06:18:58,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:58,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 06:19:00,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. 
Duration: 0.9 2023-09-29 06:19:00,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 06:19:02,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:19:08,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:19:08,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:09,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=275626.6666666667, ans=0.0 2023-09-29 06:19:13,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:14,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:14,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:19:16,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:19:16,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:19:17,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:19:21,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:19:25,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:28,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 06:19:30,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 06:19:30,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:19:30,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 06:19:30,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:19:30,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:33,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:34,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:39,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 06:19:42,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:19:43,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 06:19:45,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 06:19:45,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:46,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 06:19:47,231 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=275760.0, ans=0.125 2023-09-29 06:19:48,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:19:50,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:51,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:51,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 06:19:51,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:51,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:19:55,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:19:56,107 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.79 vs. 
limit=15.0 2023-09-29 06:19:56,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 06:19:56,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:56,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:19:56,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:19:58,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 06:20:00,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:20:00,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:20:00,730 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=275826.6666666667, ans=0.5 2023-09-29 06:20:01,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:03,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:20:03,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 06:20:03,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 06:20:03,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=275826.6666666667, ans=0.125 2023-09-29 06:20:09,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:20:11,682 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 2.232e+02 3.048e+02 3.830e+02 6.363e+02, threshold=6.096e+02, percent-clipped=13.0 2023-09-29 06:20:13,738 INFO [train.py:1039] (0/4) Epoch 8, batch 4200, loss[loss=0.1992, simple_loss=0.2441, pruned_loss=0.07709, over 22613.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2841, pruned_loss=0.07565, over 4720574.78 frames. ], batch size: 322, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:20:13,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 06:20:15,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:20:17,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:19,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:20:20,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:20,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:23,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-09-29 06:20:26,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 06:20:26,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:26,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=275893.3333333333, ans=0.2 2023-09-29 06:20:29,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:31,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:20:35,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:20:38,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:20:38,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:38,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 06:20:38,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:38,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:40,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 06:20:40,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:20:42,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:20:43,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 06:20:43,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:47,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=276026.6666666667, ans=0.2 2023-09-29 06:20:48,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:20:50,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:20:53,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:20:53,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:55,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:20:55,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 06:20:55,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:20:58,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:21:03,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:21:05,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:13,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:21:16,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 06:21:18,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:21:19,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=276160.0, ans=0.025 2023-09-29 06:21:23,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:21:25,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:28,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 06:21:32,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:21:35,631 INFO [train.py:1039] (0/4) Epoch 8, batch 4250, loss[loss=0.2079, simple_loss=0.2655, pruned_loss=0.07516, over 23340.00 frames. 
], tot_loss[loss=0.2164, simple_loss=0.2828, pruned_loss=0.07503, over 4720308.44 frames. ], batch size: 119, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:21:36,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=276226.6666666667, ans=0.0 2023-09-29 06:21:37,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:37,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:21:41,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:45,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:21:47,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 06:21:47,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:21:52,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:55,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:21:59,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:21:59,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:22:02,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:22:02,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:03,291 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.0 2023-09-29 06:22:05,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:05,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:07,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:08,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:22:09,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=276360.0, ans=0.125 2023-09-29 06:22:10,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:12,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 06:22:17,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 06:22:17,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:22:17,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:17,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:18,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:22:20,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:20,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:24,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:22:25,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:22:25,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=276426.6666666667, ans=0.1 2023-09-29 06:22:28,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:29,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=276426.6666666667, ans=0.125 2023-09-29 06:22:30,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:22:32,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 06:22:32,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:22:33,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 06:22:35,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:22:35,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=276426.6666666667, ans=0.125 2023-09-29 06:22:36,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:22:38,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:38,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:41,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 06:22:43,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:22:43,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:22:46,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:22:50,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:51,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:22:53,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:53,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:22:55,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:22:56,724 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.004e+02 2.225e+02 2.717e+02 4.251e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 06:22:56,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:22:56,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 06:22:57,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:58,439 INFO [train.py:1039] (0/4) Epoch 8, batch 4300, loss[loss=0.2148, simple_loss=0.2899, pruned_loss=0.06985, over 24440.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2825, pruned_loss=0.07494, over 4717519.75 frames. ], batch size: 66, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:23:05,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:23:05,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:08,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:23:16,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:23:16,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 06:23:16,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:23:18,371 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.11 vs. limit=10.0 2023-09-29 06:23:20,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:23:20,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:23:20,711 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 06:23:23,515 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.67 vs. limit=22.5 2023-09-29 06:23:25,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:23:27,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 06:23:30,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 06:23:31,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:23:31,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 06:23:32,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:23:35,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:23:38,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:23:38,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:40,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:23:43,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:45,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:23:45,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 06:23:45,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. 
Duration: 0.88 2023-09-29 06:23:47,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:23:47,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=276760.0, ans=0.125 2023-09-29 06:23:51,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:23:51,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 06:23:51,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 06:23:51,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=276760.0, ans=0.2 2023-09-29 06:23:52,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 06:23:53,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:23:53,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. 
Duration: 0.75 2023-09-29 06:23:53,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 06:23:57,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:23:59,250 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 06:24:00,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:24:04,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:04,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:24:05,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 06:24:07,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:24:07,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:08,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:08,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:08,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 06:24:11,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:24:12,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=276826.6666666667, ans=0.0 2023-09-29 06:24:15,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:16,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:20,324 INFO [train.py:1039] (0/4) Epoch 8, batch 4350, loss[loss=0.2241, simple_loss=0.2889, pruned_loss=0.07968, over 23599.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2832, pruned_loss=0.07483, over 4719425.42 frames. ], batch size: 256, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:24:22,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 06:24:22,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:24:26,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:27,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=276893.3333333333, ans=0.125 2023-09-29 06:24:31,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:24:33,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:24:33,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:24:39,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:24:44,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:47,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:24:47,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:49,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:24:52,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:24:53,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 06:24:53,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=277026.6666666667, ans=0.1 2023-09-29 06:24:54,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=277026.6666666667, ans=0.2 2023-09-29 06:24:56,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=277026.6666666667, ans=0.1 2023-09-29 06:24:57,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 06:24:57,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:59,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:05,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:07,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 06:25:11,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:11,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:25:11,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=277093.3333333333, ans=0.125 2023-09-29 06:25:15,911 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. 
Duration: 0.768 2023-09-29 06:25:17,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:17,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:25:18,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 06:25:20,343 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 06:25:20,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:20,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:21,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:25:23,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:23,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:23,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:25:27,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 06:25:27,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:25:27,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:28,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:28,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 06:25:30,071 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 06:25:30,078 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 06:25:30,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 06:25:33,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:25:33,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:25:35,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:25:35,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 06:25:37,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=277160.0, ans=0.0 2023-09-29 06:25:37,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=277160.0, ans=0.05 2023-09-29 06:25:38,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 06:25:39,663 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.760e+02 2.095e+02 2.321e+02 2.736e+02 4.922e+02, threshold=4.641e+02, percent-clipped=1.0 2023-09-29 06:25:41,147 INFO [train.py:1039] (0/4) Epoch 8, batch 4400, loss[loss=0.2389, simple_loss=0.2953, pruned_loss=0.09122, over 23575.00 frames. ], tot_loss[loss=0.217, simple_loss=0.284, pruned_loss=0.07502, over 4732851.19 frames. ], batch size: 256, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:25:41,239 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 06:25:41,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:46,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:46,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:48,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:51,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-09-29 06:25:51,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 06:25:51,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 06:25:51,611 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 06:25:53,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:25:53,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:53,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=277226.6666666667, ans=0.125 2023-09-29 06:25:54,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 06:25:56,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:57,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:59,263 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 06:26:01,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:01,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. 
Duration: 0.71 2023-09-29 06:26:01,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=277293.3333333333, ans=0.1 2023-09-29 06:26:02,767 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 06:26:06,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 06:26:06,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 06:26:06,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 06:26:06,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:08,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:10,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 06:26:11,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 06:26:12,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 06:26:15,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:26:15,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:18,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:18,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:18,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 06:26:18,759 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 06:26:19,335 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.50 vs. limit=15.0 2023-09-29 06:26:23,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:27,097 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.67 vs. limit=6.0 2023-09-29 06:26:27,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=277360.0, ans=0.09899494936611666 2023-09-29 06:26:29,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:26:29,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.73 vs. limit=10.0 2023-09-29 06:26:33,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 06:26:36,657 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.99 vs. limit=22.5 2023-09-29 06:26:38,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:26:40,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:26:42,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:26:44,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 06:26:44,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:26:44,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:26:44,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:26:45,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:26:50,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-09-29 06:26:54,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 06:26:55,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 06:26:56,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:56,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 06:26:56,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:26:59,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:27:00,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 06:27:02,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=277560.0, ans=0.0 2023-09-29 06:27:03,750 INFO [train.py:1039] (0/4) Epoch 8, batch 4450, loss[loss=0.1769, simple_loss=0.251, pruned_loss=0.05145, over 24278.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2846, pruned_loss=0.07517, over 4724418.61 frames. ], batch size: 56, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:27:03,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:27:06,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:27:08,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:27:11,250 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.49 vs. limit=22.5 2023-09-29 06:27:15,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:16,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:27:20,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:22,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:27:27,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:27:27,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:27,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 06:27:27,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:27,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:27:27,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:27:27,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:27:30,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:27:36,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:36,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:38,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:38,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:40,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:27:45,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:27:46,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 06:27:46,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 06:27:46,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 06:27:51,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:53,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 06:27:56,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:28:00,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:02,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 06:28:02,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:02,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:02,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:28:02,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:28:05,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:08,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:28:08,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. 
Duration: 0.94 2023-09-29 06:28:08,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=277826.6666666667, ans=0.0 2023-09-29 06:28:10,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:28:11,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:13,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:14,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:16,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:28:19,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:28:21,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 06:28:24,955 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.112e+02 2.512e+02 3.151e+02 6.272e+02, threshold=5.024e+02, percent-clipped=2.0 2023-09-29 06:28:25,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:28:26,485 INFO [train.py:1039] (0/4) Epoch 8, batch 4500, loss[loss=0.3115, simple_loss=0.3424, pruned_loss=0.1403, over 19982.00 frames. ], tot_loss[loss=0.2181, simple_loss=0.2852, pruned_loss=0.07551, over 4704480.55 frames. 
], batch size: 388, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:28:28,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:28,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=277893.3333333333, ans=0.125 2023-09-29 06:28:29,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 06:28:29,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 06:28:32,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:39,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:39,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:41,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:28:41,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:28:41,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:42,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:28:46,636 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:28:53,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:55,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:28:57,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:57,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:28:59,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:29:02,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=278026.6666666667, ans=0.125 2023-09-29 06:29:05,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:29:11,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:29:15,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:29:17,772 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.14 vs. limit=15.0 2023-09-29 06:29:18,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 06:29:20,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 06:29:20,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:21,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:21,971 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=278093.3333333333, ans=0.2 2023-09-29 06:29:23,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:29:26,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:29:26,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 06:29:26,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:29:26,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:28,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=278093.3333333333, ans=0.125 2023-09-29 06:29:31,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 06:29:31,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:29:34,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:37,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:29:37,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:29:39,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 06:29:39,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 06:29:39,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 06:29:45,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 06:29:48,373 INFO [train.py:1039] (0/4) Epoch 8, batch 4550, loss[loss=0.2158, simple_loss=0.2673, pruned_loss=0.08217, over 23754.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2831, pruned_loss=0.07482, over 4709191.21 frames. ], batch size: 232, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:29:48,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 06:29:49,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 06:29:53,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:53,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:56,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:00,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:30:01,501 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.98 vs. limit=15.0 2023-09-29 06:30:04,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:30:06,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:06,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:30:06,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:09,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:09,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 06:30:12,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:16,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 06:30:18,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 06:30:18,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:30:21,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 06:30:22,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 06:30:24,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:27,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 06:30:29,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:30:32,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 06:30:32,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=278360.0, ans=0.125 2023-09-29 06:30:35,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 06:30:39,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:41,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:41,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:44,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:44,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 06:30:44,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 06:30:45,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:30:45,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 06:30:48,741 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.14 vs. limit=15.0 2023-09-29 06:30:49,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. 
Duration: 0.54 2023-09-29 06:30:49,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:51,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:51,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:53,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:53,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:30:55,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:30:55,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 06:30:56,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:56,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:30:58,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 06:30:58,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:30:58,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. 
Duration: 0.96 2023-09-29 06:31:01,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:31:01,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:31:04,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:31:04,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:31:04,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:31:05,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:31:09,475 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.931e+02 2.102e+02 2.382e+02 3.783e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 06:31:09,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:31:11,126 INFO [train.py:1039] (0/4) Epoch 8, batch 4600, loss[loss=0.2301, simple_loss=0.2844, pruned_loss=0.08789, over 23896.00 frames. ], tot_loss[loss=0.2155, simple_loss=0.282, pruned_loss=0.07451, over 4713041.00 frames. ], batch size: 195, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:31:11,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:12,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:31:16,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:31:16,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:31:16,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:16,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 06:31:18,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=278560.0, ans=0.1 2023-09-29 06:31:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:31:24,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:31:24,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:27,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:34,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 06:31:35,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:31:36,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=278626.6666666667, ans=0.125 2023-09-29 06:31:38,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:40,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:31:42,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:49,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 06:31:49,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:31:50,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:31:55,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:55,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:31:58,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:32:01,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 06:32:02,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 06:32:07,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:09,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:32:12,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:12,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 06:32:12,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:12,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 06:32:13,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:13,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=278760.0, ans=0.0 2023-09-29 06:32:15,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:15,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=278826.6666666667, ans=0.05 2023-09-29 06:32:18,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 06:32:18,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:32:18,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:20,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 06:32:20,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 06:32:20,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 06:32:20,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:21,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:21,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:24,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:34,342 INFO [train.py:1039] (0/4) Epoch 8, batch 4650, loss[loss=0.1998, simple_loss=0.2689, pruned_loss=0.06536, over 24597.00 frames. ], tot_loss[loss=0.2157, simple_loss=0.282, pruned_loss=0.07475, over 4703220.41 frames. ], batch size: 60, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:32:35,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 06:32:38,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:40,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:40,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:32:41,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:41,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:43,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:46,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 06:32:50,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:32:53,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 06:32:53,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:53,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 06:32:54,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 06:32:54,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 06:32:54,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 06:32:54,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:56,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:32:57,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:33:00,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=278960.0, ans=0.125 2023-09-29 06:33:01,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:01,473 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 06:33:04,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 06:33:08,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:08,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 06:33:11,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 06:33:13,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:33:16,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:33:18,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:24,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:27,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:28,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:28,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:33:28,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=279093.3333333333, ans=0.0 2023-09-29 06:33:30,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=279093.3333333333, ans=0.125 2023-09-29 06:33:31,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 06:33:31,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. 
Duration: 0.83 2023-09-29 06:33:32,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 06:33:32,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 06:33:33,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:39,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:33:39,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:33:39,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 06:33:39,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:43,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:43,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:33:43,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:33:46,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:33:46,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 06:33:47,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:52,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:52,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:33:52,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:33:53,551 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 2.046e+02 2.215e+02 2.491e+02 3.733e+02, threshold=4.429e+02, percent-clipped=0.0 2023-09-29 06:33:53,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 06:33:55,132 INFO [train.py:1039] (0/4) Epoch 8, batch 4700, loss[loss=0.2175, simple_loss=0.291, pruned_loss=0.072, over 24348.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2831, pruned_loss=0.07549, over 4702127.55 frames. ], batch size: 77, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:33:55,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:33:57,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 06:34:05,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 06:34:07,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:34:09,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:09,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:34:09,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=279226.6666666667, ans=0.0 2023-09-29 06:34:16,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 06:34:17,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 06:34:19,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:19,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=279293.3333333333, ans=0.125 2023-09-29 06:34:22,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:34:22,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:34:23,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:28,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 06:34:29,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:34:33,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:33,833 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=279360.0, ans=0.2 2023-09-29 06:34:43,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 06:34:44,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:34:46,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:51,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 06:34:53,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:34:56,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:34:57,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 06:34:59,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:59,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:35:02,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:35:02,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:35:02,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 06:35:02,422 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 06:35:05,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:07,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 06:35:08,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:12,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 06:35:15,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:35:16,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:35:17,393 INFO [train.py:1039] (0/4) Epoch 8, batch 4750, loss[loss=0.217, simple_loss=0.2826, pruned_loss=0.0757, over 23809.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2834, pruned_loss=0.07529, over 4715213.70 frames. ], batch size: 179, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:35:19,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=279560.0, ans=0.125 2023-09-29 06:35:21,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:21,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:35:24,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 06:35:24,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:35:24,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=279560.0, ans=0.125 2023-09-29 06:35:27,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 06:35:29,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:35:30,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:31,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 06:35:38,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 06:35:41,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=279626.6666666667, ans=0.125 2023-09-29 06:35:41,251 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=279626.6666666667, ans=0.125 2023-09-29 06:35:42,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:35:45,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 06:35:46,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:50,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:51,645 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 06:35:51,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 06:36:00,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. 
Duration: 0.82 2023-09-29 06:36:03,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=279693.3333333333, ans=0.0 2023-09-29 06:36:04,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:06,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:09,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:36:09,710 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 06:36:09,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:12,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:36:13,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=279760.0, ans=0.125 2023-09-29 06:36:15,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:36:16,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 06:36:17,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 06:36:17,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:36:18,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:36:18,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:20,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:36:20,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=279826.6666666667, ans=0.0 2023-09-29 06:36:22,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 06:36:22,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 06:36:22,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=279826.6666666667, ans=0.0 2023-09-29 06:36:25,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:27,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:36:27,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 06:36:27,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:36:29,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 06:36:30,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:36:33,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:34,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:36:38,016 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.176e+02 2.410e+02 2.744e+02 3.912e+02, threshold=4.820e+02, percent-clipped=0.0 2023-09-29 06:36:38,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:38,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=279893.3333333333, ans=0.0 2023-09-29 06:36:39,652 INFO [train.py:1039] (0/4) Epoch 8, batch 4800, loss[loss=0.2098, simple_loss=0.2677, pruned_loss=0.07598, over 23362.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2839, pruned_loss=0.07502, over 4735692.55 frames. ], batch size: 134, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:36:39,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 06:36:41,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 06:36:42,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 06:36:44,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 06:36:44,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:45,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 06:36:51,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:52,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=279893.3333333333, ans=0.0 2023-09-29 06:36:53,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:59,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:36:59,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:59,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:01,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 06:37:01,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:37:01,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=279960.0, ans=0.0 2023-09-29 06:37:03,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 06:37:04,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:37:08,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:10,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:10,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:37:12,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:12,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:37:12,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:12,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:16,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:18,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:37:21,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:37:21,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:37:23,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:24,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 06:37:24,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 06:37:26,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:26,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:37:26,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:37:26,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:26,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:37:29,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:37:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:37:33,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:36,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:39,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:43,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 06:37:43,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:45,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:45,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:37:46,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:48,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=280160.0, ans=0.0 2023-09-29 06:37:48,853 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.38 vs. limit=10.0 2023-09-29 06:37:49,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:50,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 06:37:50,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:51,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:37:51,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:37:53,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:37:54,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:56,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:56,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:57,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 06:38:00,675 INFO [train.py:1039] (0/4) Epoch 8, batch 4850, loss[loss=0.2163, simple_loss=0.2856, pruned_loss=0.07347, over 23197.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2852, pruned_loss=0.07636, over 4722169.61 frames. ], batch size: 105, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:38:00,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 06:38:00,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 06:38:00,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:00,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:00,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:05,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:38:13,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 06:38:16,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:21,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:22,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:38:22,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:26,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 06:38:26,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=280293.3333333333, ans=0.0 2023-09-29 06:38:26,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=280293.3333333333, ans=0.125 2023-09-29 06:38:29,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:38:29,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:38:29,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 06:38:30,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=280293.3333333333, ans=0.2 2023-09-29 06:38:32,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:35,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:38:37,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:38:37,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:38:37,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-09-29 06:38:38,405 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.92 vs. limit=15.0 2023-09-29 06:38:40,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:40,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 06:38:45,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 06:38:46,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:38:55,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:38:55,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 06:38:57,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:57,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:38:58,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 06:39:01,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 06:39:01,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:03,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 06:39:03,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:03,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:03,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=280426.6666666667, ans=0.125 2023-09-29 06:39:04,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 06:39:14,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:19,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:39:19,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:39:23,647 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.212e+02 2.561e+02 3.191e+02 4.940e+02, threshold=5.123e+02, percent-clipped=1.0 2023-09-29 06:39:23,689 INFO [train.py:1039] (0/4) Epoch 8, batch 4900, loss[loss=0.2189, simple_loss=0.2593, pruned_loss=0.08922, over 19125.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.2849, pruned_loss=0.07636, over 4726159.59 frames. ], batch size: 388, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:39:25,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 06:39:25,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:39:30,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:32,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:33,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:39:37,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 06:39:41,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 06:39:45,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 06:39:46,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. 
Duration: 0.91 2023-09-29 06:39:46,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:39:48,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:39:48,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:49,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:39:49,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 06:39:52,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 06:39:52,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:39:54,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:39:54,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:39:55,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=280693.3333333333, ans=15.0 2023-09-29 06:40:00,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 06:40:00,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:01,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:01,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 06:40:03,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:40:04,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:40:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 06:40:06,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 06:40:08,787 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.28 vs. limit=15.0 2023-09-29 06:40:09,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 06:40:11,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:40:12,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:12,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 06:40:12,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:14,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:40:14,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:40:14,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 06:40:18,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:19,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:40:19,971 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=280760.0, ans=0.2 2023-09-29 06:40:21,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:40:24,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 06:40:24,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:40:24,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 06:40:24,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-09-29 06:40:35,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:36,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:40:38,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 06:40:38,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:38,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:40:41,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:44,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:40:44,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:40:44,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:44,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:40:45,617 INFO [train.py:1039] (0/4) Epoch 8, batch 4950, loss[loss=0.2266, simple_loss=0.3058, pruned_loss=0.07376, over 24474.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2834, pruned_loss=0.07584, over 4717279.21 frames. 
], batch size: 69, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:40:45,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:40:47,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=280893.3333333333, ans=0.1 2023-09-29 06:40:49,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:40:50,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:53,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 06:40:54,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 06:40:54,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:40:55,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 06:40:55,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:55,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:55,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 06:40:57,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:40:58,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:58,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:41:01,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:41:01,948 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.88 vs. limit=15.0 2023-09-29 06:41:02,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:41:04,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:04,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:41:07,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 06:41:08,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=280960.0, ans=0.0 2023-09-29 06:41:11,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=280960.0, ans=0.1 2023-09-29 06:41:12,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:14,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:41:16,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-09-29 06:41:17,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:17,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:18,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:41:20,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 06:41:22,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 06:41:24,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 06:41:27,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:41:27,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:41:27,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=281026.6666666667, ans=0.125 2023-09-29 06:41:28,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:41:28,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:41:28,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:41:30,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:32,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:41:35,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:41:39,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:39,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:39,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-09-29 06:41:40,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:41:41,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:41:43,641 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.94 vs. limit=15.0 2023-09-29 06:41:45,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:41:47,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:41:47,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:41:48,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:48,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:41:50,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:41:51,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:41:51,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 06:41:51,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:55,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 06:41:59,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:05,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 06:42:05,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:42:07,486 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.069e+02 2.336e+02 2.676e+02 4.238e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 06:42:07,528 INFO [train.py:1039] (0/4) Epoch 8, batch 5000, loss[loss=0.2234, simple_loss=0.302, pruned_loss=0.07243, over 24566.00 frames. ], tot_loss[loss=0.2168, simple_loss=0.2828, pruned_loss=0.07537, over 4720359.13 frames. ], batch size: 71, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:42:12,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=281226.6666666667, ans=0.2 2023-09-29 06:42:13,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:42:14,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:15,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. 
Duration: 0.61 2023-09-29 06:42:16,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 06:42:17,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:42:20,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 06:42:20,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:42:20,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:42:23,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 06:42:23,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:25,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:25,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 06:42:25,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:25,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:42:28,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. 
Duration: 0.95 2023-09-29 06:42:29,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 06:42:29,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:42:31,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 06:42:31,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:42:31,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:31,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:42:31,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 06:42:31,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=281293.3333333333, ans=0.125 2023-09-29 06:42:33,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 06:42:34,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 06:42:34,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:34,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:42:36,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 06:42:36,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:39,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:40,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:41,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:42:43,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 06:42:44,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:42:48,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:42:48,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=281360.0, ans=0.5 2023-09-29 06:42:52,093 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 06:42:55,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:56,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:42:56,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:42:59,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 06:42:59,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:43:01,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:01,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:04,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 06:43:04,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:07,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:09,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:12,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=281493.3333333333, ans=0.125 2023-09-29 06:43:13,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. 
Duration: 0.89 2023-09-29 06:43:14,275 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=281493.3333333333, ans=0.125 2023-09-29 06:43:15,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=281493.3333333333, ans=0.5 2023-09-29 06:43:17,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:25,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=281493.3333333333, ans=0.0 2023-09-29 06:43:26,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:28,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:43:29,903 INFO [train.py:1039] (0/4) Epoch 8, batch 5050, loss[loss=0.2257, simple_loss=0.2834, pruned_loss=0.08401, over 22804.00 frames. ], tot_loss[loss=0.2166, simple_loss=0.283, pruned_loss=0.0751, over 4724592.57 frames. ], batch size: 322, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:43:29,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:30,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:43:30,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 06:43:30,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:33,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:34,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 06:43:36,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:43:37,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:40,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:43:40,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 06:43:41,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:41,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:44,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:43:46,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:43:46,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 06:43:55,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 06:43:55,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:43:57,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:43:58,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 06:43:58,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:01,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:01,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:03,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:03,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 06:44:03,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 06:44:05,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:44:06,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=281693.3333333333, ans=0.2 2023-09-29 06:44:08,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:11,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:11,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 06:44:14,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:17,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 06:44:18,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:44:19,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:44:21,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:21,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:44:22,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:44:24,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 06:44:26,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:26,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:44:26,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:44:26,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 06:44:28,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:44:30,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:32,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=281760.0, ans=0.125 2023-09-29 06:44:34,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:35,020 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 06:44:35,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:44:37,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:44:38,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:44:38,588 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 06:44:41,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:41,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 06:44:41,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:43,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=281826.6666666667, ans=0.95 2023-09-29 06:44:44,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:46,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:46,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 06:44:48,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 06:44:50,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:51,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:44:51,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 06:44:52,963 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.260e+02 2.527e+02 2.886e+02 4.203e+02, threshold=5.054e+02, percent-clipped=0.0 2023-09-29 06:44:53,005 INFO [train.py:1039] (0/4) Epoch 8, batch 5100, loss[loss=0.2154, simple_loss=0.3005, pruned_loss=0.06512, over 24297.00 frames. ], tot_loss[loss=0.2169, simple_loss=0.2837, pruned_loss=0.07509, over 4718944.69 frames. ], batch size: 74, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:44:53,317 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 06:44:56,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:59,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 06:44:59,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 06:44:59,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:02,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:45:06,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:45:06,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 06:45:06,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. 
Duration: 0.76 2023-09-29 06:45:13,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:45:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:45:18,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:21,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 06:45:22,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:24,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:45:24,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:45:26,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 06:45:31,463 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 06:45:32,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:45:33,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 06:45:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 06:45:34,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=282026.6666666667, ans=0.125 2023-09-29 06:45:34,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=282026.6666666667, ans=0.125 2023-09-29 06:45:36,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:41,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=282093.3333333333, ans=0.125 2023-09-29 06:45:46,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:45:49,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 06:45:49,496 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 06:45:49,519 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 06:45:51,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 06:45:52,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:45:55,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 06:46:00,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 06:46:02,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:46:03,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:46:05,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 06:46:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:46:08,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 06:46:11,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=282160.0, ans=0.0 2023-09-29 06:46:13,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:46:13,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:46:13,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:46:15,131 INFO [train.py:1039] (0/4) Epoch 8, batch 5150, loss[loss=0.2313, simple_loss=0.3083, pruned_loss=0.07714, over 23955.00 frames. 
], tot_loss[loss=0.2188, simple_loss=0.2853, pruned_loss=0.07618, over 4708398.56 frames. ], batch size: 80, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:46:15,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:46:15,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:46:17,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:46:18,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 06:46:18,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 06:46:18,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 06:46:18,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:46:18,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 06:46:19,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:21,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 06:46:22,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 06:46:24,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:46:28,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 06:46:30,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:30,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:46:32,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:46:32,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:46:32,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:46:33,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:46:33,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:46:33,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 06:46:37,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 06:46:37,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:46:39,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:46:41,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 06:46:42,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:46:45,314 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.80 vs. limit=22.5 2023-09-29 06:46:49,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:46:52,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 06:46:53,064 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.11 vs. limit=22.5 2023-09-29 06:46:57,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:00,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=282360.0, ans=0.125 2023-09-29 06:47:03,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:47:03,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:06,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:08,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:11,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 06:47:16,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:47:19,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:47:19,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:47:21,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:21,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=282493.3333333333, ans=0.125 2023-09-29 06:47:23,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:24,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-09-29 06:47:25,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=282493.3333333333, ans=0.125 2023-09-29 06:47:29,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:29,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:47:31,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:31,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:47:33,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:47:33,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:47:33,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:47:34,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=282493.3333333333, ans=0.125 2023-09-29 06:47:35,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:47:37,010 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=282560.0, ans=0.1 2023-09-29 06:47:38,005 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.091e+02 2.433e+02 2.751e+02 4.119e+02, threshold=4.867e+02, percent-clipped=0.0 2023-09-29 06:47:38,050 INFO [train.py:1039] (0/4) Epoch 8, batch 5200, loss[loss=0.2374, simple_loss=0.2838, pruned_loss=0.09554, over 23797.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.2852, pruned_loss=0.07586, over 4721272.69 frames. ], batch size: 212, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:47:39,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:47:41,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:47:44,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:50,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 06:47:50,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:47:51,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:52,489 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.33 vs. limit=22.5 2023-09-29 06:47:54,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:47:56,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:47:56,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:59,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 06:47:59,662 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=282626.6666666667, ans=0.0 2023-09-29 06:48:01,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:48:03,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:05,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 06:48:08,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:48:09,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:48:11,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 06:48:11,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 06:48:13,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. 
Duration: 0.98 2023-09-29 06:48:14,381 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.76 vs. limit=22.5 2023-09-29 06:48:14,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:14,782 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 06:48:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:48:16,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:16,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:48:17,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 06:48:19,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:48:21,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:25,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 06:48:25,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. 
Duration: 0.7 2023-09-29 06:48:25,538 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:48:26,203 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.98 vs. limit=22.5 2023-09-29 06:48:26,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 06:48:27,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=282760.0, ans=0.2 2023-09-29 06:48:28,680 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=282760.0, ans=0.1 2023-09-29 06:48:29,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 06:48:29,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:48:33,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=282760.0, ans=0.125 2023-09-29 06:48:37,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:48:37,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:39,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 06:48:41,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:48:41,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 06:48:41,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:41,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:48:44,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:45,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:48:46,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=282826.6666666667, ans=0.07 2023-09-29 06:48:51,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:51,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:48:51,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:51,486 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=282826.6666666667, ans=0.1 2023-09-29 06:48:55,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 06:48:58,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 06:48:58,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:58,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:49:00,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:01,423 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.45 vs. limit=6.0 2023-09-29 06:49:01,692 INFO [train.py:1039] (0/4) Epoch 8, batch 5250, loss[loss=0.1902, simple_loss=0.2661, pruned_loss=0.05719, over 24317.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.284, pruned_loss=0.07577, over 4725819.61 frames. ], batch size: 61, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:49:01,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:49:03,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:49:04,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:49:08,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:08,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:49:10,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:49:10,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=282893.3333333333, ans=0.125 2023-09-29 06:49:16,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:49:18,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:49:21,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:49:21,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:49:24,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 06:49:24,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:26,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 06:49:38,836 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=283026.6666666667, ans=0.125 2023-09-29 06:49:45,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=283026.6666666667, ans=0.1 2023-09-29 06:50:01,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=283160.0, ans=0.125 2023-09-29 06:50:16,296 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.079e+02 2.357e+02 2.633e+02 5.213e+02, threshold=4.714e+02, percent-clipped=2.0 2023-09-29 06:50:16,339 INFO [train.py:1039] (0/4) Epoch 8, batch 5300, loss[loss=0.2047, simple_loss=0.2874, pruned_loss=0.06098, over 24652.00 frames. ], tot_loss[loss=0.2169, simple_loss=0.2828, pruned_loss=0.07552, over 4725832.30 frames. ], batch size: 73, lr: 1.25e-02, grad_scale: 32.0 2023-09-29 06:50:31,650 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-8.pt 2023-09-29 06:50:38,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:50:39,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 06:50:39,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 06:50:39,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:39,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 06:50:39,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:39,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:39,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:39,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:50:39,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:39,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:50:40,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:50:40,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 06:50:40,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 06:50:40,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 06:50:40,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:50:40,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. 
Duration: 0.83 2023-09-29 06:50:41,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 06:50:41,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:42,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:42,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:42,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:42,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:50:42,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:42,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:42,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:43,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:43,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:43,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 06:50:43,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:43,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:50:44,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 06:50:44,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:44,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:44,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 06:50:44,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 06:50:44,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:50:44,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:50:44,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 06:50:45,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 06:50:45,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 06:50:46,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:50:46,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:46,769 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 06:50:46,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 06:50:46,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:50:47,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:47,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 06:50:47,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 06:50:47,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 06:50:47,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:50,877 INFO [train.py:1039] (0/4) Epoch 9, batch 0, loss[loss=0.2388, simple_loss=0.2958, pruned_loss=0.09089, over 23834.00 frames. ], tot_loss[loss=0.2388, simple_loss=0.2958, pruned_loss=0.09089, over 23834.00 frames. 
], batch size: 179, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:50:50,878 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 06:51:04,818 INFO [train.py:1071] (0/4) Epoch 9, validation: loss=0.2824, simple_loss=0.2767, pruned_loss=0.144, over 1125622.00 frames. 2023-09-29 06:51:04,819 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 06:51:06,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 06:51:06,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:51:08,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:51:09,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=283306.6666666667, ans=0.125 2023-09-29 06:51:14,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:14,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:51:14,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:16,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 06:51:17,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-09-29 06:51:19,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:20,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:26,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:51:26,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:29,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 06:51:29,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=283373.3333333333, ans=0.125 2023-09-29 06:51:30,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:40,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:51:40,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:42,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. 
Duration: 0.98 2023-09-29 06:51:45,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:51:47,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:51:48,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:51:51,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:51:51,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=283440.0, ans=0.0 2023-09-29 06:51:52,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=283506.6666666667, ans=0.05 2023-09-29 06:51:56,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:02,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 06:52:06,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 06:52:06,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:06,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:06,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 06:52:06,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:52:10,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 06:52:13,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:13,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:17,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:52:20,490 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 06:52:23,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:52:26,929 INFO [train.py:1039] (0/4) Epoch 9, batch 50, loss[loss=0.1914, simple_loss=0.2706, pruned_loss=0.0561, over 24472.00 frames. ], tot_loss[loss=0.2195, simple_loss=0.2873, pruned_loss=0.0758, over 1073840.76 frames. ], batch size: 63, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:52:27,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:28,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:28,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-09-29 06:52:30,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:52:30,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:52:31,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:34,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:36,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:38,655 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=283640.0, ans=0.1 2023-09-29 06:52:41,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 06:52:41,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:45,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=283706.6666666667, ans=0.2 2023-09-29 06:52:48,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:52:49,180 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.71 vs. 
limit=15.0 2023-09-29 06:52:50,277 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.89 vs. limit=10.0 2023-09-29 06:52:51,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 06:52:53,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 06:52:56,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:52:57,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:52:57,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:59,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:01,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:53:01,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:53:01,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:53:10,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:12,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 06:53:12,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:53:14,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 06:53:15,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:53:16,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=283840.0, ans=0.1 2023-09-29 06:53:17,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:53:17,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 06:53:17,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:19,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 06:53:22,472 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=283840.0, ans=0.125 2023-09-29 06:53:27,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:53:27,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:28,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 06:53:30,301 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 2.133e+02 2.436e+02 2.893e+02 4.514e+02, threshold=4.872e+02, percent-clipped=0.0 2023-09-29 06:53:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:53:30,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:34,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 06:53:34,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 06:53:36,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:36,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:36,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=283906.6666666667, ans=0.125 2023-09-29 06:53:37,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:39,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:39,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 06:53:39,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-09-29 06:53:40,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:53:42,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:43,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:53:43,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 06:53:43,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 06:53:46,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:47,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:48,935 INFO [train.py:1039] (0/4) Epoch 9, batch 100, loss[loss=0.2121, simple_loss=0.2813, pruned_loss=0.07145, over 16236.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.2873, pruned_loss=0.07485, over 1882466.01 frames. ], batch size: 35, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:53:49,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:53:49,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:53:52,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 06:53:56,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:53:57,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=283973.3333333333, ans=0.125 2023-09-29 06:54:02,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:03,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 06:54:03,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:54:03,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=284040.0, ans=0.0 2023-09-29 06:54:04,164 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.76 vs. limit=12.0 2023-09-29 06:54:06,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:54:06,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:06,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:54:08,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:54:08,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 06:54:10,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 06:54:13,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:54:14,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:14,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:14,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:18,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 06:54:20,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:21,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:21,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:54:23,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:54:23,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=284106.6666666667, ans=0.125 2023-09-29 06:54:28,336 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. 
Duration: 0.768 2023-09-29 06:54:28,360 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 06:54:29,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:54:29,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:54:31,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=284106.6666666667, ans=0.125 2023-09-29 06:54:34,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:54:36,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=284173.3333333333, ans=0.125 2023-09-29 06:54:38,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:39,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:39,836 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=284173.3333333333, ans=0.125 2023-09-29 06:54:42,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=284173.3333333333, ans=0.2 2023-09-29 06:54:44,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:44,207 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-09-29 06:54:45,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:54:49,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:54:51,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:54:51,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:54,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:54:58,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:00,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:55:03,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:03,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=284240.0, ans=0.0 2023-09-29 06:55:04,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:04,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:04,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 06:55:04,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:06,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 06:55:06,270 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 06:55:06,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:08,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:55:08,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:08,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:10,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:55:10,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:55:11,544 INFO [train.py:1039] (0/4) Epoch 9, batch 150, loss[loss=0.2115, simple_loss=0.2702, pruned_loss=0.07635, over 23454.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.285, pruned_loss=0.07492, over 2506216.19 frames. ], batch size: 134, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:55:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 06:55:11,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:11,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:13,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:14,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:55:14,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:55:16,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:21,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:21,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:23,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:24,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=284306.6666666667, ans=0.04949747468305833 2023-09-29 06:55:26,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:55:26,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:30,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:55:30,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:34,613 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-09-29 06:55:35,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 06:55:35,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 06:55:35,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 06:55:37,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:55:37,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:55:39,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:55:41,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:41,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 06:55:41,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,954 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 06:55:45,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:51,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:56,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:55:58,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 06:56:03,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:56:03,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:56:03,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:06,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:56:08,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 06:56:08,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:56:10,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:11,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 06:56:16,141 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 2.032e+02 2.478e+02 3.173e+02 5.553e+02, threshold=4.955e+02, percent-clipped=3.0 2023-09-29 06:56:16,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:17,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:56:17,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:56:18,543 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.13 vs. limit=15.0 2023-09-29 06:56:20,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:23,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-09-29 06:56:26,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:56:26,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:56:27,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:29,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:56:29,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 06:56:30,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:30,793 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 06:56:34,637 INFO [train.py:1039] (0/4) Epoch 9, batch 200, loss[loss=0.2263, simple_loss=0.2843, pruned_loss=0.08413, over 23535.00 frames. ], tot_loss[loss=0.218, simple_loss=0.2853, pruned_loss=0.07529, over 2995313.91 frames. ], batch size: 134, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:56:36,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:39,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:56:39,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 06:56:42,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 06:56:44,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:44,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:47,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 06:56:49,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:56:49,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=284706.6666666667, ans=0.125 2023-09-29 06:56:50,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:52,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:52,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=284706.6666666667, ans=0.2 2023-09-29 06:56:56,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:56:57,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:57,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 06:57:14,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:57:14,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:57:16,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:57:17,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:57:19,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:57:19,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:57:20,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:22,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:57:23,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:23,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:25,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 06:57:25,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 06:57:25,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:28,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:57:36,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:37,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=284840.0, ans=0.125 2023-09-29 06:57:38,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=284906.6666666667, ans=0.1 2023-09-29 06:57:44,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:44,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:57:52,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:55,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 06:57:55,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:56,911 INFO [train.py:1039] (0/4) Epoch 9, batch 250, loss[loss=0.2076, simple_loss=0.2794, pruned_loss=0.06792, over 24468.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2849, pruned_loss=0.0753, over 3380310.01 frames. 
], batch size: 63, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:57:56,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:57:56,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:58,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:57:58,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 06:58:00,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:00,210 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 06:58:01,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:02,452 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.30 vs. limit=15.0 2023-09-29 06:58:05,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:58:06,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:06,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:58:08,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 06:58:08,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=284973.3333333333, ans=0.125 2023-09-29 06:58:09,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:11,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:58:14,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:58:16,984 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.60 vs. limit=22.5 2023-09-29 06:58:20,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=285040.0, ans=0.125 2023-09-29 06:58:23,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=285040.0, ans=0.125 2023-09-29 06:58:25,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:28,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:58:28,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:58:34,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 06:58:36,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:58:36,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:58:37,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:39,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:58:39,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:58:39,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:41,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:58:41,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=285106.6666666667, ans=0.125 2023-09-29 06:58:42,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 06:58:43,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:45,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:58:46,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 06:58:46,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:58:46,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:58:48,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:58:48,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:58:50,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:53,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:58:53,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:58:58,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:59:01,252 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.006e+02 2.267e+02 2.549e+02 3.617e+02, threshold=4.534e+02, percent-clipped=0.0 2023-09-29 06:59:04,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 06:59:04,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:07,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:59:09,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:12,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:12,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:59:16,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 06:59:18,729 INFO [train.py:1039] (0/4) Epoch 9, batch 300, loss[loss=0.2297, simple_loss=0.2784, pruned_loss=0.09054, over 23670.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2825, pruned_loss=0.07345, over 3687581.78 frames. ], batch size: 232, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:59:18,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:59:18,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:59:20,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 06:59:22,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 06:59:22,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:59:22,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 06:59:27,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:29,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:59:34,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:59:34,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 06:59:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:37,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:59:37,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 06:59:37,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:38,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=285373.3333333333, ans=0.125 2023-09-29 06:59:40,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 06:59:46,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:59:46,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 06:59:50,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 06:59:52,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:53,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:55,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:55,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 06:59:55,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:59:57,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:00:00,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:00:00,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=285440.0, ans=0.125 2023-09-29 07:00:02,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:00:04,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=285440.0, ans=0.95 2023-09-29 07:00:07,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:00:07,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 07:00:07,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:00:10,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:12,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 07:00:14,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:19,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:00:22,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:00:22,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 07:00:25,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:25,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 07:00:27,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:28,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:00:30,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 07:00:30,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:00:32,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:32,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 07:00:32,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=285573.3333333333, ans=0.0 2023-09-29 07:00:32,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=285573.3333333333, ans=0.0 2023-09-29 07:00:32,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=285573.3333333333, ans=0.1 2023-09-29 07:00:35,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:35,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:35,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 07:00:35,834 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:00:37,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:37,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:42,580 INFO [train.py:1039] (0/4) Epoch 9, batch 350, loss[loss=0.2041, simple_loss=0.2863, pruned_loss=0.061, over 24580.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2801, pruned_loss=0.07269, over 3917851.70 frames. ], batch size: 71, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:00:44,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:00:44,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:00:47,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:49,820 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.80 vs. limit=5.0 2023-09-29 07:00:50,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=285640.0, ans=0.1 2023-09-29 07:00:53,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:55,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:00:56,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:58,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 07:01:00,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:00,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 07:01:02,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:03,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 07:01:05,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:08,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 07:01:09,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:01:12,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:14,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:01:15,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:01:15,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:16,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:16,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:16,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:01:18,905 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.14 vs. limit=15.0 2023-09-29 07:01:19,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:01:19,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:25,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=285773.3333333333, ans=0.0 2023-09-29 07:01:27,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:01:27,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:01:27,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 07:01:29,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:35,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 07:01:35,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:40,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:40,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:01:40,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:42,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 07:01:45,354 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.981e+02 2.324e+02 2.780e+02 5.402e+02, threshold=4.648e+02, percent-clipped=1.0 2023-09-29 07:01:45,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:47,010 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 07:01:47,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 07:01:47,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:01:51,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 07:01:54,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:57,939 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.32 vs. limit=15.0 2023-09-29 07:01:58,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:01:59,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:00,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:01,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:03,959 INFO [train.py:1039] (0/4) Epoch 9, batch 400, loss[loss=0.2118, simple_loss=0.2889, pruned_loss=0.06734, over 24330.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2794, pruned_loss=0.07182, over 4098265.94 frames. ], batch size: 77, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:02:04,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:07,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 07:02:08,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:02:11,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 07:02:11,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:11,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:12,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=285973.3333333333, ans=0.0 2023-09-29 07:02:13,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:02:13,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:16,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:18,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:20,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 07:02:21,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 07:02:21,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:02:23,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 07:02:25,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:02:28,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:28,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 07:02:28,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:02:28,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:30,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=286040.0, ans=10.0 2023-09-29 07:02:31,535 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 07:02:31,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. 
Duration: 0.94 2023-09-29 07:02:36,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:38,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:38,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 07:02:39,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 07:02:44,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:02:46,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:02:52,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 07:02:58,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:02:59,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 07:03:01,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:03:02,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:03:02,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. 
Duration: 0.83 2023-09-29 07:03:06,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:03:09,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:03:11,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:03:11,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=286240.0, ans=0.0 2023-09-29 07:03:14,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:16,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 07:03:19,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:03:19,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 07:03:20,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:03:20,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:03:21,666 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.70 vs. limit=12.0 2023-09-29 07:03:22,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. 
Duration: 0.89 2023-09-29 07:03:25,687 INFO [train.py:1039] (0/4) Epoch 9, batch 450, loss[loss=0.2178, simple_loss=0.2785, pruned_loss=0.07852, over 23262.00 frames. ], tot_loss[loss=0.213, simple_loss=0.281, pruned_loss=0.07252, over 4244621.33 frames. ], batch size: 119, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:03:25,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:03:25,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:03:25,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:03:29,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 07:03:29,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:03:31,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:03:31,893 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-09-29 07:03:32,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:03:32,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 07:03:32,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 07:03:34,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:03:36,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:03:36,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=286306.6666666667, ans=0.025 2023-09-29 07:03:44,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:46,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:03:47,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 07:03:49,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 07:03:51,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:03:53,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=286373.3333333333, ans=0.04949747468305833 2023-09-29 07:03:54,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:56,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:03:59,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:03:59,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:04:00,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=286440.0, ans=0.125 2023-09-29 07:04:00,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=286440.0, ans=0.2 2023-09-29 07:04:02,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 07:04:02,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 07:04:05,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 07:04:05,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:08,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:08,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:04:10,814 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 07:04:10,828 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-09-29 07:04:12,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:04:13,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:04:15,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:04:20,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:04:20,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:04:21,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:04:23,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 07:04:26,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:29,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:04:29,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:04:31,301 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.979e+02 2.168e+02 2.458e+02 3.361e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 07:04:31,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. 
Duration: 0.83 2023-09-29 07:04:33,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=286573.3333333333, ans=0.1 2023-09-29 07:04:36,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:04:36,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 07:04:38,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 07:04:40,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:43,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:04:46,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:04:48,024 INFO [train.py:1039] (0/4) Epoch 9, batch 500, loss[loss=0.2127, simple_loss=0.2907, pruned_loss=0.06736, over 24683.00 frames. ], tot_loss[loss=0.2141, simple_loss=0.2819, pruned_loss=0.07316, over 4361288.29 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:04:48,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:04:48,207 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 07:04:51,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:04:53,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:04:53,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:53,363 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 07:04:55,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 07:04:55,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:59,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:05:01,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=286640.0, ans=0.1 2023-09-29 07:05:03,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:05:04,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:05:05,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=286706.6666666667, ans=0.0 2023-09-29 07:05:07,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:05:07,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:05:07,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:08,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=286706.6666666667, ans=0.0 2023-09-29 07:05:17,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:19,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:05:19,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:05:20,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:20,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 07:05:20,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:05:24,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:05:26,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:05:26,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:05:26,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:05:28,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 07:05:28,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=286773.3333333333, ans=0.2 2023-09-29 07:05:31,370 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 07:05:34,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:34,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:36,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:36,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=286840.0, ans=0.05 2023-09-29 07:05:37,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:37,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:05:40,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 07:05:42,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:05:44,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:05:46,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:05:49,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:54,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:58,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 07:05:58,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:58,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:06:01,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 07:06:01,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:06:04,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:09,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 07:06:10,906 INFO [train.py:1039] (0/4) Epoch 9, batch 550, loss[loss=0.1984, simple_loss=0.2738, pruned_loss=0.06156, over 24502.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2827, pruned_loss=0.07375, over 4445112.90 frames. 
], batch size: 63, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:06:12,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 07:06:12,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:13,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 07:06:15,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:06:15,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:17,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:06:17,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=286973.3333333333, ans=0.2 2023-09-29 07:06:18,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:06:20,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:06:21,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 07:06:21,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:06:27,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:27,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:31,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:06:31,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:33,859 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=287040.0, ans=0.125 2023-09-29 07:06:35,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=287040.0, ans=0.0 2023-09-29 07:06:36,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 07:06:38,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 07:06:39,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:06:43,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 07:06:44,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:46,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:06:48,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:48,668 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 07:06:51,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:53,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:06:57,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:57,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:06:57,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:06:59,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:00,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=287173.3333333333, ans=0.0 2023-09-29 07:07:01,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. 
Duration: 0.58 2023-09-29 07:07:02,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 07:07:03,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:03,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:07:04,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:07:04,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:07:06,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:07:10,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:07:11,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:07:11,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:13,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 07:07:14,829 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.039e+02 2.212e+02 2.496e+02 3.392e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 07:07:14,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 07:07:17,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:17,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:07:18,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:20,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:07:20,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:07:28,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 07:07:31,102 INFO [train.py:1039] (0/4) Epoch 9, batch 600, loss[loss=0.1843, simple_loss=0.2609, pruned_loss=0.05382, over 24338.00 frames. ], tot_loss[loss=0.2154, simple_loss=0.2833, pruned_loss=0.07374, over 4522231.27 frames. ], batch size: 61, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:07:31,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 07:07:31,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=287306.6666666667, ans=0.2 2023-09-29 07:07:34,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:07:34,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 07:07:34,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:34,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=287306.6666666667, ans=0.1 2023-09-29 07:07:41,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:07:42,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:07:45,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 07:07:47,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:07:50,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:07:52,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:54,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 07:07:54,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:08:01,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 07:08:05,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:08:05,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:07,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:08:11,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:08:11,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:08:13,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:20,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=287506.6666666667, ans=0.125 2023-09-29 07:08:22,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:08:25,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:25,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:08:25,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:33,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-09-29 07:08:38,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:08:38,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:08:42,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 07:08:44,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:08:46,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 07:08:46,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:08:46,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:08:51,884 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=287640.0, ans=0.0 2023-09-29 07:08:53,006 INFO [train.py:1039] (0/4) Epoch 9, batch 650, loss[loss=0.2131, simple_loss=0.2938, pruned_loss=0.06622, over 24318.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.283, pruned_loss=0.07357, over 4576524.96 frames. ], batch size: 74, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:08:53,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-29 07:08:53,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=287640.0, ans=0.0 2023-09-29 07:08:55,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:08:56,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=287640.0, ans=0.125 2023-09-29 07:08:58,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:08:58,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:09:00,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:04,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 07:09:05,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:09:11,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:09:11,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:14,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:19,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-09-29 07:09:20,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:21,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:24,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:09:25,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=287773.3333333333, ans=0.0 2023-09-29 07:09:26,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:09:29,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:29,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:31,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:09:32,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:34,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:09:35,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:09:36,000 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-09-29 07:09:36,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:36,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:39,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:42,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:42,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:09:42,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:09:44,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 07:09:45,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:09:45,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:09:47,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:09:47,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:47,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 07:09:49,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 07:09:50,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 07:09:50,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:50,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:50,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:09:50,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:53,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:58,844 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.023e+02 2.249e+02 2.557e+02 3.525e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 07:10:01,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:01,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:03,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:10:06,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:10:07,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:10:07,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:08,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=287906.6666666667, ans=0.125 2023-09-29 07:10:15,900 INFO [train.py:1039] (0/4) Epoch 9, batch 700, loss[loss=0.2128, simple_loss=0.2888, pruned_loss=0.06843, over 24352.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2797, pruned_loss=0.07231, over 4587815.42 frames. ], batch size: 77, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:10:15,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:10:15,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:16,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:16,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:21,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 07:10:22,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 07:10:25,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. 
Duration: 0.54 2023-09-29 07:10:25,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:10:30,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 07:10:32,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=288040.0, ans=0.2 2023-09-29 07:10:33,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:37,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:10:39,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:39,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:10:41,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:44,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:47,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:10:47,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 07:10:49,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 07:10:52,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 07:10:52,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=288106.6666666667, ans=0.125 2023-09-29 07:10:55,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:10:57,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:10:58,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:11:03,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:11:05,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 07:11:07,397 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.35 vs. limit=15.0 2023-09-29 07:11:10,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:10,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:11:10,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-09-29 07:11:15,103 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.60 vs. limit=22.5 2023-09-29 07:11:15,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:11:15,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:19,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:11:19,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=288173.3333333333, ans=0.05 2023-09-29 07:11:25,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:11:25,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 07:11:29,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 07:11:29,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 07:11:31,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:33,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:34,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:11:36,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:36,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 07:11:38,248 INFO [train.py:1039] (0/4) Epoch 9, batch 750, loss[loss=0.2102, simple_loss=0.2883, pruned_loss=0.06602, over 24653.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.279, pruned_loss=0.0724, over 4602628.53 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:11:41,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 07:11:41,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 07:11:41,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 07:11:43,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 07:11:43,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 07:11:45,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:11:46,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 07:11:48,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:11:48,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:11:51,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:11:52,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:52,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:11:52,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:55,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:11:57,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:11:58,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:12:00,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:02,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:02,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 07:12:03,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 07:12:03,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:04,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=288373.3333333333, ans=0.1 2023-09-29 07:12:05,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:09,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:12:09,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 07:12:09,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:12:12,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 07:12:12,476 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 07:12:14,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 07:12:14,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:12:14,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:12:16,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 07:12:24,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:12:24,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:24,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:12:27,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:29,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:12:29,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 07:12:29,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:12:30,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 07:12:32,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:12:34,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=288506.6666666667, ans=0.0 2023-09-29 07:12:36,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:12:36,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. 
Duration: 0.92 2023-09-29 07:12:38,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:42,018 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.85 vs. limit=10.0 2023-09-29 07:12:44,786 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.964e+02 2.224e+02 2.470e+02 4.454e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 07:12:44,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:12:45,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:12:46,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:48,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:12:52,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 07:12:53,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:12:53,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 07:12:58,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:12:58,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=288573.3333333333, ans=0.1 2023-09-29 07:13:01,764 INFO [train.py:1039] (0/4) Epoch 9, batch 800, loss[loss=0.2029, simple_loss=0.2864, pruned_loss=0.05972, over 24636.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2801, pruned_loss=0.07279, over 4621740.79 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:13:01,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:01,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:13:11,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:11,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:12,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:13:12,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:14,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:14,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:13:15,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:18,130 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:13:19,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:21,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:13:24,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 07:13:24,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:26,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:26,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:13:26,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:26,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 07:13:28,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:28,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. 
Duration: 0.39 2023-09-29 07:13:31,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:35,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:38,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:13:38,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:39,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:39,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:41,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=288773.3333333333, ans=0.0 2023-09-29 07:13:44,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:13:44,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:13:45,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 07:13:47,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 07:13:47,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. 
Duration: 0.57 2023-09-29 07:13:47,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:13:47,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:50,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:50,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:13:51,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=288840.0, ans=0.125 2023-09-29 07:13:56,033 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 07:13:56,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 07:13:57,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:13:59,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:14:00,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=288840.0, ans=0.0 2023-09-29 07:14:01,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:14:07,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:14:07,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 07:14:08,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:14:10,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=288906.6666666667, ans=0.2 2023-09-29 07:14:13,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 07:14:21,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:22,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:14:24,348 INFO [train.py:1039] (0/4) Epoch 9, batch 850, loss[loss=0.2229, simple_loss=0.3023, pruned_loss=0.07175, over 24308.00 frames. ], tot_loss[loss=0.2134, simple_loss=0.281, pruned_loss=0.07283, over 4650321.81 frames. ], batch size: 74, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:14:24,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 07:14:24,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:14:24,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:26,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. 
Duration: 0.94 2023-09-29 07:14:27,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:29,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:14:32,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:34,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:14:35,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:14:37,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 07:14:37,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 07:14:37,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 07:14:39,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:39,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:14:41,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:43,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 07:14:43,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:14:48,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:48,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:48,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 07:14:51,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 07:14:56,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:56,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 07:14:59,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 07:15:00,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 07:15:02,476 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 07:15:04,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:04,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:15:04,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:15:05,552 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.18 vs. limit=22.5 2023-09-29 07:15:06,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:06,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=289106.6666666667, ans=0.1 2023-09-29 07:15:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:09,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 07:15:10,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:12,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:12,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:15:12,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:15:14,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 07:15:16,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:15:18,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 07:15:22,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:15:22,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:24,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:15:24,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:25,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:28,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:30,142 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.042e+02 2.331e+02 2.770e+02 4.715e+02, threshold=4.662e+02, percent-clipped=1.0 2023-09-29 07:15:30,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:15:31,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:15:31,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 07:15:33,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:15:34,478 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.20 vs. limit=22.5 2023-09-29 07:15:40,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:15:41,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:42,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 07:15:42,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:15:42,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:45,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 07:15:47,721 INFO [train.py:1039] (0/4) Epoch 9, batch 900, loss[loss=0.253, simple_loss=0.3101, pruned_loss=0.09795, over 22712.00 frames. ], tot_loss[loss=0.2152, simple_loss=0.2823, pruned_loss=0.07399, over 4656245.54 frames. ], batch size: 322, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:15:50,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=289306.6666666667, ans=0.125 2023-09-29 07:15:53,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:15:57,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:57,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 07:16:00,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:16:02,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 07:16:03,217 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.50 vs. limit=10.0 2023-09-29 07:16:03,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:16:04,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:16:04,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:04,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:16:05,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:16:16,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:16:17,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:16:17,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:16:17,583 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.01 vs. limit=15.0 2023-09-29 07:16:20,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:21,229 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=289440.0, ans=0.0 2023-09-29 07:16:25,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 07:16:27,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:16:32,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:16:32,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:16:32,445 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 07:16:33,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 07:16:41,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 07:16:41,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:16:41,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:16:46,847 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.45 vs. limit=10.0 2023-09-29 07:16:49,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:49,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:16:51,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 07:16:51,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:55,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 07:16:57,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:16:57,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:59,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:00,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:17:02,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 07:17:02,780 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 07:17:05,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:17:05,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 07:17:07,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:07,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=289573.3333333333, ans=0.125 2023-09-29 07:17:07,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=289573.3333333333, ans=0.125 2023-09-29 07:17:10,228 INFO [train.py:1039] (0/4) Epoch 9, batch 950, loss[loss=0.2202, simple_loss=0.2737, pruned_loss=0.08337, over 23815.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2831, pruned_loss=0.0746, over 4670479.60 frames. ], batch size: 212, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:17:11,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 07:17:16,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:20,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:17:21,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:17:24,788 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 07:17:28,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:29,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:31,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:31,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:17:32,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 07:17:33,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:17:35,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:36,895 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.26 vs. limit=15.0 2023-09-29 07:17:37,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-09-29 07:17:37,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:41,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:43,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 07:17:43,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 07:17:45,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:46,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:17:52,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:52,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:56,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 07:17:58,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 07:17:58,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:18:00,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:00,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:18:07,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 07:18:07,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:18:10,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:12,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:12,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 07:18:12,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:12,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:18:13,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-09-29 07:18:16,766 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 1.953e+02 2.303e+02 2.846e+02 4.844e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 07:18:16,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:18:20,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:20,955 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.84 vs. limit=15.0 2023-09-29 07:18:24,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:26,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 07:18:26,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 07:18:29,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=289906.6666666667, ans=0.125 2023-09-29 07:18:30,100 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.69 vs. limit=6.0 2023-09-29 07:18:31,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:32,686 INFO [train.py:1039] (0/4) Epoch 9, batch 1000, loss[loss=0.2329, simple_loss=0.3061, pruned_loss=0.07985, over 23980.00 frames. ], tot_loss[loss=0.2155, simple_loss=0.2821, pruned_loss=0.07446, over 4684712.37 frames. 
], batch size: 80, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:18:36,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 07:18:37,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:18:38,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=289973.3333333333, ans=0.125 2023-09-29 07:18:39,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=289973.3333333333, ans=0.2 2023-09-29 07:18:43,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:18:45,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 07:18:45,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 07:18:49,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:18:49,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:51,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:54,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 07:18:57,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. 
Duration: 0.76 2023-09-29 07:18:57,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 07:18:59,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:18:59,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=290040.0, ans=0.1 2023-09-29 07:19:00,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 07:19:02,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 07:19:02,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 07:19:02,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:04,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:15,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:16,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:19:16,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:18,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:19:18,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 07:19:18,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:20,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:19:20,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:21,835 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 07:19:24,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 07:19:25,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 07:19:28,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 07:19:29,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:19:34,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:34,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:19:34,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:19:34,732 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=290173.3333333333, ans=0.0 2023-09-29 07:19:38,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:19:39,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 07:19:41,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=290240.0, ans=0.125 2023-09-29 07:19:43,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:19:43,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 07:19:43,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 07:19:46,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:19:46,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:48,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:19:51,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:19:53,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:19:56,374 INFO [train.py:1039] (0/4) Epoch 9, batch 1050, loss[loss=0.2047, simple_loss=0.2812, pruned_loss=0.06408, over 24561.00 frames. ], tot_loss[loss=0.2143, simple_loss=0.281, pruned_loss=0.07379, over 4683705.64 frames. ], batch size: 71, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:19:56,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:19:58,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:20:01,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:20:01,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:02,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:05,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:20:05,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:20:07,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=290306.6666666667, ans=0.09899494936611666 2023-09-29 07:20:09,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:20:10,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 07:20:10,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:20:12,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:20:13,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 07:20:14,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:14,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 07:20:17,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:20:17,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 07:20:17,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:20:24,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:24,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:20:26,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:27,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. 
Duration: 0.81 2023-09-29 07:20:27,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 07:20:29,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:32,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 07:20:34,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=290440.0, ans=0.125 2023-09-29 07:20:36,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 07:20:36,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:20:41,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:20:43,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:20:43,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:20:43,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:20:48,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:20:53,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. 
Duration: 0.86 2023-09-29 07:20:54,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 07:20:56,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 07:20:56,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:20:56,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:20:58,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 07:21:02,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:21:04,054 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.054e+02 2.289e+02 2.734e+02 4.286e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 07:21:04,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:21:04,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:05,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:05,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:21:07,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=290573.3333333333, ans=0.125 2023-09-29 07:21:10,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 07:21:12,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:12,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 07:21:12,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 07:21:12,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=290573.3333333333, ans=0.125 2023-09-29 07:21:13,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:21:16,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:21:18,272 INFO [train.py:1039] (0/4) Epoch 9, batch 1100, loss[loss=0.2161, simple_loss=0.2962, pruned_loss=0.06806, over 24363.00 frames. ], tot_loss[loss=0.2136, simple_loss=0.2809, pruned_loss=0.07322, over 4701663.95 frames. ], batch size: 74, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:21:23,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:21:29,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:21:32,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:21:32,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:33,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 07:21:33,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:21:34,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=290706.6666666667, ans=0.125 2023-09-29 07:21:36,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:21:40,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:21:42,277 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=290706.6666666667, ans=0.125 2023-09-29 07:21:43,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:21:43,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 07:21:45,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-29 07:21:45,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:45,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:48,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:21:48,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=290706.6666666667, ans=0.125 2023-09-29 07:21:50,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:21:51,755 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:21:54,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:21:58,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 07:21:59,910 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 07:21:59,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:00,258 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:22:03,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:22:05,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:22:05,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=290773.3333333333, ans=0.125 2023-09-29 07:22:06,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:22:07,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 07:22:08,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:22:08,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:22:08,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:22:10,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:10,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 07:22:17,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:22:17,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. 
Duration: 0.91 2023-09-29 07:22:19,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.whiten.whitening_limit, batch_count=290840.0, ans=12.0 2023-09-29 07:22:19,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:22:24,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:22:27,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 07:22:27,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:22:29,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:32,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:32,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:34,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 07:22:35,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.96 vs. limit=15.0 2023-09-29 07:22:35,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 07:22:36,473 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.79 vs. limit=12.0 2023-09-29 07:22:37,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:37,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 07:22:39,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:22:39,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 07:22:41,395 INFO [train.py:1039] (0/4) Epoch 9, batch 1150, loss[loss=0.2354, simple_loss=0.2937, pruned_loss=0.08853, over 23891.00 frames. ], tot_loss[loss=0.2149, simple_loss=0.2822, pruned_loss=0.07376, over 4703662.54 frames. ], batch size: 195, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:22:41,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:22:41,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:22:41,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:22:48,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:22:48,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=290973.3333333333, ans=0.0 2023-09-29 07:22:49,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:22:52,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:52,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:22:52,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 07:22:52,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:22:56,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 07:22:56,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:57,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:23:03,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 07:23:05,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:10,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:23:10,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:10,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 07:23:10,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:23:10,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:23:18,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 07:23:18,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:20,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:23:21,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=291106.6666666667, ans=0.125 2023-09-29 07:23:29,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=291173.3333333333, ans=0.125 2023-09-29 07:23:30,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:36,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:38,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. 
Duration: 0.86 2023-09-29 07:23:38,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:38,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:45,430 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 07:23:47,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:48,990 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.090e+02 2.381e+02 2.869e+02 4.983e+02, threshold=4.763e+02, percent-clipped=2.0 2023-09-29 07:23:54,711 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 07:23:56,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=291240.0, ans=0.125 2023-09-29 07:23:57,237 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.55 vs. limit=6.0 2023-09-29 07:23:57,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:23:59,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:23:59,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:24:00,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 07:24:02,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:03,994 INFO [train.py:1039] (0/4) Epoch 9, batch 1200, loss[loss=0.2122, simple_loss=0.2882, pruned_loss=0.06807, over 23972.00 frames. ], tot_loss[loss=0.2152, simple_loss=0.2828, pruned_loss=0.07379, over 4718617.34 frames. ], batch size: 86, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:24:07,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:24:07,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:24:10,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:10,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:10,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:24:10,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=291306.6666666667, ans=0.125 2023-09-29 07:24:11,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:24:15,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:24:15,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:24:16,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:20,111 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 07:24:24,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 07:24:26,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:24:29,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:24:31,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:32,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:24:32,815 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 07:24:34,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:42,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=291440.0, ans=0.125 2023-09-29 07:24:43,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:24:43,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:24:43,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 07:24:43,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=291440.0, ans=0.125 2023-09-29 07:24:44,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:24:50,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 07:24:53,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 07:24:54,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:54,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:58,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:24:58,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:25:00,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:25:00,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:25:02,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 07:25:02,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 07:25:02,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:25:02,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:02,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:25:05,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:05,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:25:10,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:25:13,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:25:16,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=291573.3333333333, ans=0.1 2023-09-29 07:25:17,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 07:25:18,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=291573.3333333333, ans=0.2 2023-09-29 07:25:20,916 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. 
Duration: 0.9935 2023-09-29 07:25:22,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:26,016 INFO [train.py:1039] (0/4) Epoch 9, batch 1250, loss[loss=0.2039, simple_loss=0.2824, pruned_loss=0.06266, over 24466.00 frames. ], tot_loss[loss=0.2165, simple_loss=0.2835, pruned_loss=0.07479, over 4708668.88 frames. ], batch size: 63, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:25:26,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:26,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=291640.0, ans=0.125 2023-09-29 07:25:27,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:25:29,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:30,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=291640.0, ans=0.125 2023-09-29 07:25:32,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 07:25:37,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:25:38,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:39,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. 
Duration: 0.47 2023-09-29 07:25:41,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:25:42,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:25:47,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:25:47,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:25:49,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:50,642 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.14 vs. limit=15.0 2023-09-29 07:25:51,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:25:56,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:25:56,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:25:56,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:25:57,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:57,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:02,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=291773.3333333333, ans=0.125 2023-09-29 07:26:03,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:03,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:26:10,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 07:26:10,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:26:13,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:14,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 07:26:15,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:26:15,552 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 07:26:15,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:26:15,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:20,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:20,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:21,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:26:24,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 07:26:24,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 07:26:24,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 07:26:27,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:28,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=291840.0, ans=0.1 2023-09-29 07:26:29,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 07:26:29,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:31,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-09-29 07:26:31,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:26:33,088 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.977e+02 2.179e+02 2.388e+02 3.416e+02, threshold=4.359e+02, percent-clipped=0.0 2023-09-29 07:26:33,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 07:26:33,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:26:34,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:26:34,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:26:34,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:36,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 07:26:39,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:42,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:26:44,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:26:45,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 07:26:48,061 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.03 vs. limit=15.0 2023-09-29 07:26:48,704 INFO [train.py:1039] (0/4) Epoch 9, batch 1300, loss[loss=0.1872, simple_loss=0.2633, pruned_loss=0.05549, over 24636.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2842, pruned_loss=0.07484, over 4718110.83 frames. ], batch size: 65, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:26:48,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:48,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 07:26:53,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:55,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:26:56,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:26:58,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:59,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:26:59,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 07:27:05,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 07:27:06,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:27:08,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 07:27:13,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:27:17,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:17,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:19,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:27:22,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:23,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:27:23,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:27:23,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 07:27:31,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:27:31,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 07:27:32,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 07:27:34,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:27:35,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:27:37,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:27:38,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 07:27:41,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:41,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 07:27:41,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:44,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:44,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:27:48,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. 
Duration: 0.87 2023-09-29 07:27:49,077 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.93 vs. limit=6.0 2023-09-29 07:27:49,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 07:27:51,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 07:27:56,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:27:59,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 07:28:01,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:01,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=292240.0, ans=0.125 2023-09-29 07:28:08,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=292306.6666666667, ans=0.125 2023-09-29 07:28:10,011 INFO [train.py:1039] (0/4) Epoch 9, batch 1350, loss[loss=0.2031, simple_loss=0.284, pruned_loss=0.06114, over 24323.00 frames. ], tot_loss[loss=0.2152, simple_loss=0.283, pruned_loss=0.07375, over 4727959.76 frames. ], batch size: 74, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:28:10,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 07:28:13,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:28:16,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:19,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=292306.6666666667, ans=0.95 2023-09-29 07:28:21,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:21,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:25,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:28:25,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:28,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=292373.3333333333, ans=0.125 2023-09-29 07:28:29,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:31,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 07:28:33,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:28:33,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 07:28:36,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 07:28:37,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:28:39,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:28:39,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 07:28:41,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 07:28:41,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=292440.0, ans=0.125 2023-09-29 07:28:42,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 07:28:44,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:44,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 07:28:56,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:05,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:05,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:29:06,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 07:29:10,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:10,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 07:29:11,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:29:13,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:29:14,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:29:18,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 07:29:20,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:29:21,322 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.985e+02 2.227e+02 2.562e+02 4.004e+02, threshold=4.454e+02, percent-clipped=0.0 2023-09-29 07:29:25,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 07:29:27,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 07:29:33,094 INFO [train.py:1039] (0/4) Epoch 9, batch 1400, loss[loss=0.2103, simple_loss=0.2888, pruned_loss=0.06596, over 24515.00 frames. 
], tot_loss[loss=0.2136, simple_loss=0.2813, pruned_loss=0.07295, over 4728713.11 frames. ], batch size: 66, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:29:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 07:29:36,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:39,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:29:39,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:29:42,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=292640.0, ans=0.125 2023-09-29 07:29:46,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 07:29:46,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 07:29:47,090 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.08 vs. limit=15.0 2023-09-29 07:29:55,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=292706.6666666667, ans=0.1 2023-09-29 07:29:56,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:29:58,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 07:30:00,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:30:01,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:30:04,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:30:07,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:30:15,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=292773.3333333333, ans=0.025 2023-09-29 07:30:15,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=292773.3333333333, ans=0.125 2023-09-29 07:30:16,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:18,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:21,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 07:30:22,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:30:23,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:30:24,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 07:30:24,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:24,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=292840.0, ans=0.0 2023-09-29 07:30:26,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:30:26,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:30:26,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:30:28,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 07:30:28,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:30:32,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:32,804 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.43 vs. limit=15.0 2023-09-29 07:30:35,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:30:40,052 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.98 vs. 
limit=15.0 2023-09-29 07:30:42,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 07:30:43,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:30:43,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:30:47,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:30:48,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:30:50,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:30:50,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292906.6666666667, ans=0.1 2023-09-29 07:30:53,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:30:56,703 INFO [train.py:1039] (0/4) Epoch 9, batch 1450, loss[loss=0.1927, simple_loss=0.26, pruned_loss=0.06264, over 21741.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2797, pruned_loss=0.07215, over 4722289.06 frames. ], batch size: 47, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:30:57,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=292973.3333333333, ans=0.2 2023-09-29 07:30:58,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:30:58,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:58,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:31:03,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:05,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:31:08,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:31:08,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 07:31:09,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:31:09,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 07:31:11,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:11,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:11,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 07:31:13,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 07:31:14,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:31:15,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=293040.0, ans=0.125 2023-09-29 07:31:16,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 07:31:16,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:16,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:31:19,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:22,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:25,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:31:25,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:31:29,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:29,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:31:32,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:32,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:31:32,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:36,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 07:31:39,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:44,383 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 07:31:45,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:31:48,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:31:49,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:31:49,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 07:31:54,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:31:55,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 07:31:57,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 07:31:57,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:00,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:00,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:32:02,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 07:32:05,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 07:32:05,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 07:32:07,324 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.945e+02 2.236e+02 2.452e+02 4.458e+02, threshold=4.473e+02, percent-clipped=1.0 2023-09-29 07:32:07,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:09,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 07:32:14,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=293240.0, ans=0.015 2023-09-29 07:32:19,978 INFO [train.py:1039] (0/4) Epoch 9, batch 1500, loss[loss=0.2362, simple_loss=0.2921, pruned_loss=0.09021, over 23727.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2804, pruned_loss=0.07229, over 4717870.54 frames. ], batch size: 232, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:32:22,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=293306.6666666667, ans=0.04949747468305833 2023-09-29 07:32:23,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 07:32:23,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:32:23,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:32:24,947 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-44000.pt 2023-09-29 07:32:28,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:28,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:30,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:32:30,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. 
Duration: 0.65 2023-09-29 07:32:32,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:32:33,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:32:33,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:33,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:36,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:32:38,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:38,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=293373.3333333333, ans=0.0 2023-09-29 07:32:43,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:44,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 07:32:44,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:32:46,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:32:46,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:32:49,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 07:32:54,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 07:32:57,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:59,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 07:33:03,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:33:06,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:06,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:33:07,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:07,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 07:33:07,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:33:08,482 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.47 vs. limit=15.0 2023-09-29 07:33:09,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:33:09,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 07:33:10,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:13,647 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.63 vs. limit=12.0 2023-09-29 07:33:14,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:33:14,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 07:33:19,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:33:23,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:33:27,649 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 07:33:27,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:27,762 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 07:33:29,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:31,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:33:31,430 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 07:33:31,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:33:35,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 07:33:38,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:42,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:44,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:45,488 INFO [train.py:1039] (0/4) Epoch 9, batch 1550, loss[loss=0.1999, simple_loss=0.2678, pruned_loss=0.06605, over 24481.00 frames. ], tot_loss[loss=0.214, simple_loss=0.2817, pruned_loss=0.07318, over 4721611.13 frames. 
], batch size: 58, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:33:45,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=293640.0, ans=0.125 2023-09-29 07:33:47,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 07:33:47,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 07:33:48,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:33:48,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 07:33:48,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 07:33:51,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:51,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=293640.0, ans=0.125 2023-09-29 07:33:52,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:52,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:33:52,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 07:33:54,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:54,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:57,962 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 07:33:59,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:59,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:34:00,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:34:02,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:34:02,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 07:34:04,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:34:04,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 07:34:04,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=293706.6666666667, ans=0.0 2023-09-29 07:34:06,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. 
Duration: 0.88 2023-09-29 07:34:06,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 07:34:07,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:08,253 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.62 vs. limit=15.0 2023-09-29 07:34:09,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:12,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:34:15,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 07:34:15,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 07:34:22,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:24,864 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.16 vs. limit=15.0 2023-09-29 07:34:27,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:34:28,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:34:28,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 07:34:28,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 07:34:33,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:34:35,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:38,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:34:40,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:34:41,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:41,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 07:34:41,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:34:43,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:34:43,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:45,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:34:45,384 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-09-29 07:34:48,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:34:51,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=293906.6666666667, ans=0.1 2023-09-29 07:34:55,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 07:34:56,634 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.971e+02 2.191e+02 2.523e+02 4.378e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 07:34:58,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:00,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:01,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 07:35:01,737 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=293906.6666666667, ans=0.125 2023-09-29 07:35:03,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:35:04,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:04,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 07:35:04,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:35:06,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:35:08,744 INFO [train.py:1039] (0/4) Epoch 9, batch 1600, loss[loss=0.1956, simple_loss=0.2722, pruned_loss=0.0595, over 24487.00 frames. ], tot_loss[loss=0.216, simple_loss=0.283, pruned_loss=0.0745, over 4707101.52 frames. ], batch size: 63, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:35:10,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:12,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 07:35:13,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 07:35:15,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 07:35:15,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=293973.3333333333, ans=0.2 2023-09-29 07:35:16,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:19,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 07:35:21,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 07:35:23,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:35:28,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:35:32,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 07:35:35,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:35:36,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 07:35:37,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:37,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 07:35:40,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=294106.6666666667, ans=0.1 2023-09-29 07:35:44,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. 
Duration: 0.97 2023-09-29 07:35:44,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=294106.6666666667, ans=0.125 2023-09-29 07:35:44,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=294106.6666666667, ans=0.2 2023-09-29 07:35:49,474 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.53 vs. limit=10.0 2023-09-29 07:35:51,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=294106.6666666667, ans=0.0 2023-09-29 07:35:52,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:52,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 07:35:52,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:53,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:53,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 07:35:54,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=294106.6666666667, ans=0.1 2023-09-29 07:35:54,237 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=294106.6666666667, ans=0.125 2023-09-29 07:35:57,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 07:36:01,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 07:36:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:36:04,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:04,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:06,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:36:08,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:36:10,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:36:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 07:36:17,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:18,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:36:20,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 07:36:20,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:36:22,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 07:36:28,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:30,240 INFO [train.py:1039] (0/4) Epoch 9, batch 1650, loss[loss=0.271, simple_loss=0.3223, pruned_loss=0.1099, over 19780.00 frames. ], tot_loss[loss=0.216, simple_loss=0.2834, pruned_loss=0.07434, over 4701019.94 frames. ], batch size: 388, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:36:31,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:36:31,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:36:31,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 07:36:31,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. 
Duration: 0.37275 2023-09-29 07:36:32,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 07:36:33,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 07:36:35,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:36,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:38,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:36:38,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:36:38,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=294306.6666666667, ans=0.125 2023-09-29 07:36:39,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:41,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 07:36:44,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:36:44,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:44,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 07:36:44,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:36:44,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 07:36:46,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 07:36:52,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:36:55,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:37:07,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 07:37:09,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:10,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 07:37:14,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:15,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:37:17,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:37:17,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:37:19,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:37:19,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:21,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:23,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:24,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:24,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:26,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:26,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:37:31,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:32,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 07:37:34,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:34,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. 
Duration: 0.76 2023-09-29 07:37:35,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 07:37:35,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 07:37:35,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:37,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:37:37,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:38,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 07:37:42,534 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.311e+02 2.647e+02 4.475e+02, threshold=4.622e+02, percent-clipped=1.0 2023-09-29 07:37:42,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:44,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:37:44,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:47,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. 
Duration: 0.96 2023-09-29 07:37:51,799 INFO [train.py:1039] (0/4) Epoch 9, batch 1700, loss[loss=0.2095, simple_loss=0.2646, pruned_loss=0.07724, over 23613.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.2823, pruned_loss=0.07366, over 4710091.92 frames. ], batch size: 256, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:37:51,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:51,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:37:52,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 07:37:53,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:37:53,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:37:53,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:55,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:37:55,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:37:55,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 07:37:59,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 07:38:09,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:38:12,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:38:18,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=294706.6666666667, ans=0.125 2023-09-29 07:38:19,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:38:19,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:19,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:38:19,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:22,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 07:38:25,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:38:25,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:27,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-09-29 07:38:29,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:38:30,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 07:38:32,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 07:38:34,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:36,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 07:38:38,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:38:39,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=294773.3333333333, ans=0.0 2023-09-29 07:38:47,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:38:47,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:38:47,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=294840.0, ans=0.1 2023-09-29 07:38:49,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:52,114 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.60 vs. 
limit=15.0 2023-09-29 07:38:52,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:38:52,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 07:38:52,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:54,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:54,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 07:38:55,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:38:55,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:38:55,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:55,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:38:57,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=294906.6666666667, ans=0.125 2023-09-29 07:39:00,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 07:39:00,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:39:01,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:01,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:39:03,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:04,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=294906.6666666667, ans=0.0 2023-09-29 07:39:05,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:07,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 07:39:08,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:12,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:12,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 07:39:15,873 INFO [train.py:1039] (0/4) Epoch 9, batch 1750, loss[loss=0.2131, simple_loss=0.2806, pruned_loss=0.07275, over 23350.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.2806, pruned_loss=0.07363, over 4701463.68 frames. 
], batch size: 105, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:39:18,331 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.82 vs. limit=10.0 2023-09-29 07:39:19,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:21,094 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=294973.3333333333, ans=0.125 2023-09-29 07:39:22,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:22,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:39:22,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 07:39:23,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:39:27,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:39:28,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:31,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 07:39:34,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:39:36,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 07:39:36,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:38,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:39:42,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:39:42,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 07:39:44,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:45,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:39:45,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 07:39:48,307 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.11 vs. 
limit=15.0 2023-09-29 07:39:49,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=295106.6666666667, ans=0.125 2023-09-29 07:39:50,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=295106.6666666667, ans=0.1 2023-09-29 07:39:53,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:39:56,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:39:56,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:58,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=295106.6666666667, ans=0.09899494936611666 2023-09-29 07:39:58,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=295106.6666666667, ans=0.125 2023-09-29 07:40:00,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=295106.6666666667, ans=0.0 2023-09-29 07:40:01,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:01,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:40:03,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:40:03,995 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.27 vs. limit=15.0 2023-09-29 07:40:06,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:08,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:09,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:40:09,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 07:40:13,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:15,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 07:40:17,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:18,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:20,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:40:23,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:40:23,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. 
Duration: 0.436375 2023-09-29 07:40:25,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:26,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:27,932 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.058e+02 2.373e+02 2.670e+02 4.900e+02, threshold=4.746e+02, percent-clipped=2.0 2023-09-29 07:40:31,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:34,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:40:35,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=14.62 vs. limit=15.0 2023-09-29 07:40:35,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:40:35,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 07:40:35,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:37,711 INFO [train.py:1039] (0/4) Epoch 9, batch 1800, loss[loss=0.2428, simple_loss=0.2894, pruned_loss=0.09813, over 22917.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2793, pruned_loss=0.07307, over 4694729.59 frames. ], batch size: 322, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:40:37,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. 
Duration: 0.536375 2023-09-29 07:40:37,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:37,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:40:37,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:40:38,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=295306.6666666667, ans=0.0 2023-09-29 07:40:39,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:40:41,752 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.41 vs. limit=22.5 2023-09-29 07:40:42,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:40:44,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:46,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:40:47,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:50,062 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.16 vs. 
limit=15.0 2023-09-29 07:40:50,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:40:53,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:56,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:40:59,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:59,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:59,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=295373.3333333333, ans=0.0 2023-09-29 07:41:00,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:41:02,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:41:02,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 07:41:02,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:05,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:10,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. 
Duration: 0.83 2023-09-29 07:41:12,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 07:41:12,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 07:41:12,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:15,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:15,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:41:17,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:41:24,698 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 07:41:26,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:41:28,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:30,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 07:41:30,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 07:41:30,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 07:41:31,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:41:31,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:41:33,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=295506.6666666667, ans=0.125 2023-09-29 07:41:33,626 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=295506.6666666667, ans=0.125 2023-09-29 07:41:37,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 07:41:39,864 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=295506.6666666667, ans=0.1 2023-09-29 07:41:44,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:41:44,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 07:41:46,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:41:46,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:47,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 07:41:47,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 07:41:50,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:41:50,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:41:53,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 07:41:53,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:56,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:41:56,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:41:56,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:58,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:42:00,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:42:02,004 INFO [train.py:1039] (0/4) Epoch 9, batch 1850, loss[loss=0.2048, simple_loss=0.274, pruned_loss=0.0678, over 24628.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2801, pruned_loss=0.07346, over 4705110.79 frames. 
], batch size: 60, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:42:02,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:42:02,944 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=12.0 2023-09-29 07:42:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:03,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=295640.0, ans=0.125 2023-09-29 07:42:06,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:42:08,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:14,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:42:16,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 07:42:20,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 07:42:23,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 07:42:28,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:42:28,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 07:42:28,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:42:39,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:42:41,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 07:42:41,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=295773.3333333333, ans=0.125 2023-09-29 07:42:44,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:42:44,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:42:48,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=295773.3333333333, ans=0.0 2023-09-29 07:42:49,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 07:42:49,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:49,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:42:50,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 07:42:52,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:55,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:57,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:42:57,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:59,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:42:59,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:01,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:03,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:43:06,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 07:43:07,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:13,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:43:13,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 07:43:13,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 07:43:13,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 07:43:14,715 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.027e+02 2.265e+02 2.527e+02 4.357e+02, threshold=4.531e+02, percent-clipped=0.0 2023-09-29 07:43:15,024 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 07:43:16,568 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 07:43:18,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:43:19,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:43:19,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:19,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:19,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 07:43:19,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:43:21,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:43:21,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:43:22,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:43:24,109 INFO [train.py:1039] (0/4) Epoch 9, batch 1900, loss[loss=0.2084, simple_loss=0.2705, pruned_loss=0.07311, over 23395.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2816, pruned_loss=0.0739, over 4714149.04 frames. ], batch size: 134, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:43:24,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:43:24,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 07:43:25,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:25,905 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 07:43:25,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:43:26,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=295973.3333333333, ans=0.0 2023-09-29 07:43:27,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:43:31,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=295973.3333333333, ans=0.125 2023-09-29 07:43:32,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:34,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=295973.3333333333, ans=0.0 2023-09-29 07:43:35,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:43:37,522 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 07:43:37,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 07:43:39,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:40,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:43:40,764 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 07:43:40,806 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 07:43:45,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 07:43:46,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 07:43:50,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 07:43:53,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 07:44:05,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 07:44:05,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=296106.6666666667, ans=0.0 2023-09-29 07:44:08,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 07:44:08,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:09,721 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 07:44:09,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 07:44:09,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 07:44:11,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 07:44:11,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:44:15,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. 
Duration: 0.97 2023-09-29 07:44:20,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:44:21,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:21,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 07:44:23,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:44:25,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=296173.3333333333, ans=0.125 2023-09-29 07:44:26,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 07:44:27,061 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=296173.3333333333, ans=0.0 2023-09-29 07:44:28,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:28,572 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:44:34,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:44:34,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:44:34,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:44:35,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:44:37,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:44:37,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:44:39,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=296240.0, ans=0.09899494936611666 2023-09-29 07:44:41,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:44:44,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:44,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:44:46,162 INFO [train.py:1039] (0/4) Epoch 9, batch 1950, loss[loss=0.2234, simple_loss=0.3042, pruned_loss=0.07127, over 24347.00 frames. ], tot_loss[loss=0.2156, simple_loss=0.2824, pruned_loss=0.07437, over 4706468.89 frames. ], batch size: 74, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:44:46,537 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:44:47,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:44:47,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 07:44:47,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:49,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:50,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:44:55,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:44:56,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:56,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:44:58,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 07:44:58,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:44:58,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:00,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:04,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:45:04,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:45:04,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:06,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:09,842 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.17 vs. limit=15.0 2023-09-29 07:45:10,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:45:10,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:45:10,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:45:10,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:15,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:19,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:45:19,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:19,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:45:19,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. 
Duration: 0.72 2023-09-29 07:45:19,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:45:19,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:45:21,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:24,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:25,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:45:30,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:45:33,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:45:33,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:45:33,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=296440.0, ans=0.0 2023-09-29 07:45:34,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 07:45:34,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:45:39,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:45:39,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:45:39,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=296506.6666666667, ans=0.0 2023-09-29 07:45:40,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:45:49,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:50,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:53,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:55,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:59,267 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.955e+02 2.207e+02 2.649e+02 3.533e+02, threshold=4.414e+02, percent-clipped=0.0 2023-09-29 07:45:59,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:45:59,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:46:00,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. 
Duration: 0.86 2023-09-29 07:46:00,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:46:03,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:46:04,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 07:46:06,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:09,269 INFO [train.py:1039] (0/4) Epoch 9, batch 2000, loss[loss=0.2269, simple_loss=0.295, pruned_loss=0.07942, over 23240.00 frames. ], tot_loss[loss=0.2152, simple_loss=0.2824, pruned_loss=0.07401, over 4720503.51 frames. ], batch size: 93, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:46:09,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:46:10,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:46:10,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:46:11,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:46:12,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:46:15,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=296640.0, ans=0.125 2023-09-29 07:46:17,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 07:46:17,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:46:23,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:46:25,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 07:46:26,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:46:26,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:30,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:46:31,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 07:46:35,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:46:39,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 07:46:39,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:46:42,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 07:46:42,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:45,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:46:46,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:46:46,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:48,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:46:49,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:46:49,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 07:46:53,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 07:46:53,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 07:46:53,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:01,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:02,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:47:02,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:03,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:47:04,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:04,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:05,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=296840.0, ans=0.125 2023-09-29 07:47:06,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:06,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:07,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:47:10,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:47:12,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 07:47:15,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:47:16,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:47:24,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:24,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=296906.6666666667, ans=0.2 2023-09-29 07:47:25,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:26,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:27,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:47:27,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 07:47:29,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:31,099 INFO [train.py:1039] (0/4) Epoch 9, batch 2050, loss[loss=0.2097, simple_loss=0.2754, pruned_loss=0.07197, over 23539.00 frames. ], tot_loss[loss=0.2142, simple_loss=0.281, pruned_loss=0.07372, over 4704699.56 frames. ], batch size: 120, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:47:31,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:34,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:34,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:40,342 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.19 vs. limit=6.0 2023-09-29 07:47:41,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:46,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:47:46,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:48,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:47:49,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-09-29 07:47:49,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:47:50,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:51,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:48:00,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:00,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:01,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=297040.0, ans=0.1 2023-09-29 07:48:01,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=297040.0, ans=0.125 2023-09-29 07:48:03,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 07:48:06,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:07,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 07:48:07,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:10,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:48:11,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:13,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:48:13,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:13,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=297106.6666666667, ans=0.05 2023-09-29 07:48:14,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:48:16,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:48:16,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:48:21,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:24,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:48:25,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:48:26,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:48:26,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=297173.3333333333, ans=0.2 2023-09-29 07:48:26,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=297173.3333333333, ans=0.0 2023-09-29 07:48:30,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=297173.3333333333, ans=0.125 2023-09-29 07:48:32,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:48:33,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=297173.3333333333, ans=0.05 2023-09-29 07:48:36,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:48:36,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 07:48:42,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:44,025 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 2.106e+02 2.256e+02 2.757e+02 3.895e+02, threshold=4.512e+02, percent-clipped=0.0 2023-09-29 07:48:44,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 07:48:45,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=297240.0, ans=0.0 2023-09-29 07:48:47,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:48:48,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 07:48:52,603 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 07:48:52,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:48:54,337 INFO [train.py:1039] (0/4) Epoch 9, batch 2100, loss[loss=0.2089, simple_loss=0.2875, pruned_loss=0.06516, over 24448.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2792, pruned_loss=0.07358, over 4689630.00 frames. ], batch size: 69, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:48:54,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:54,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:48:56,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:56,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 07:48:57,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. 
Duration: 0.96 2023-09-29 07:48:59,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:49:02,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:49:02,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:49:05,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:06,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:49:06,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 07:49:06,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:49:08,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 07:49:08,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 07:49:08,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=297373.3333333333, ans=0.0 2023-09-29 07:49:09,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:09,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 07:49:10,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 07:49:11,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 07:49:18,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 07:49:18,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:49:23,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:49:23,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:49:28,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:49:29,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 07:49:29,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:29,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:49:32,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 07:49:32,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:49:32,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 07:49:33,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 07:49:33,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 07:49:35,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:49:36,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:49:41,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:41,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:42,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:44,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 07:49:44,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:44,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:49:44,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:46,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 07:49:48,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 07:49:48,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 07:49:53,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:49:56,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:56,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 07:50:03,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:06,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:50:06,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:06,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:06,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-09-29 07:50:08,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:08,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:09,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:50:10,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:50:11,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:13,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 07:50:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 07:50:14,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:16,063 INFO [train.py:1039] (0/4) Epoch 9, batch 2150, loss[loss=0.2093, simple_loss=0.2943, pruned_loss=0.06215, over 24301.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2783, pruned_loss=0.07305, over 4690105.51 frames. ], batch size: 74, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:50:19,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:50:19,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 07:50:19,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:50:19,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:50:25,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:50:26,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:26,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:28,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:50:28,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:28,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:50:33,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:33,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:50:33,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 07:50:37,316 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.95 vs. limit=15.0 2023-09-29 07:50:38,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:38,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 07:50:44,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:46,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:50:48,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:48,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:50:49,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:49,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 07:50:51,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:52,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 07:50:54,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:50:56,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:57,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:57,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:59,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:51:01,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:01,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:51:03,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:03,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 07:51:03,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 07:51:06,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:06,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:09,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:09,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:51:11,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:12,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:12,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 07:51:15,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 07:51:15,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:51:16,539 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 07:51:17,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:17,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 07:51:19,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 07:51:19,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:51:19,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 07:51:19,477 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 07:51:19,477 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 07:51:20,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 07:51:22,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:22,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:51:22,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:51:22,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=297906.6666666667, ans=0.1 2023-09-29 07:51:24,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:25,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-29 07:51:27,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:27,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:28,430 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.066e+02 2.283e+02 2.527e+02 4.333e+02, threshold=4.566e+02, percent-clipped=0.0 2023-09-29 07:51:36,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=297906.6666666667, ans=0.1 2023-09-29 07:51:37,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:51:37,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 07:51:39,072 INFO [train.py:1039] (0/4) Epoch 9, batch 2200, loss[loss=0.2268, simple_loss=0.3008, pruned_loss=0.07642, over 23997.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2793, pruned_loss=0.07306, over 4702823.47 frames. ], batch size: 80, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:51:40,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:51:46,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:47,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:51:47,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 07:51:49,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:51:52,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:54,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:54,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 07:51:58,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 07:52:00,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:52:01,033 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=7.70 vs. limit=15.0 2023-09-29 07:52:07,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 07:52:10,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:11,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:13,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:52:15,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 07:52:16,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 07:52:20,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:52:21,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:23,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:52:26,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:52:28,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:28,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=298173.3333333333, ans=0.125 2023-09-29 07:52:28,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=298173.3333333333, ans=0.125 2023-09-29 07:52:30,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:52:31,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 07:52:32,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=298173.3333333333, ans=0.125 2023-09-29 07:52:33,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 07:52:34,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:36,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 07:52:38,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:38,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:52:39,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:42,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:42,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:42,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:42,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:44,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-09-29 07:52:44,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:52:46,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=298240.0, ans=0.125 2023-09-29 07:52:47,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:52:50,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=298240.0, ans=0.125 2023-09-29 07:52:53,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:52:53,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:52:56,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:52:57,814 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 07:53:00,815 INFO [train.py:1039] (0/4) Epoch 9, batch 2250, loss[loss=0.2132, simple_loss=0.2768, pruned_loss=0.0748, over 23433.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2792, pruned_loss=0.07245, over 4720012.80 frames. ], batch size: 134, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:53:00,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:53:00,988 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. 
Duration: 0.896 2023-09-29 07:53:03,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:53:03,088 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 07:53:04,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:06,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:53:06,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:06,971 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.04 vs. limit=15.0 2023-09-29 07:53:07,823 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 07:53:09,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:53:12,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:18,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:53:20,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:53:24,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:53:24,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:24,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=298373.3333333333, ans=0.0 2023-09-29 07:53:25,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:27,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 07:53:27,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:28,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=298373.3333333333, ans=0.125 2023-09-29 07:53:29,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:53:30,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 07:53:32,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:53:32,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:33,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:53:39,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=298440.0, ans=0.125 2023-09-29 07:53:40,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:53:40,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:53:40,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 07:53:42,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:45,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:53:51,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:53,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:54,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:54,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:58,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:54:00,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:54:05,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:54:05,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:54:12,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:54:12,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:54:13,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:54:15,284 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.931e+02 2.128e+02 2.517e+02 4.313e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 07:54:18,161 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.90 vs. limit=22.5 2023-09-29 07:54:18,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:54:20,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:54:20,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 07:54:21,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 07:54:21,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:54:23,829 INFO [train.py:1039] (0/4) Epoch 9, batch 2300, loss[loss=0.2441, simple_loss=0.3012, pruned_loss=0.09348, over 23438.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.2803, pruned_loss=0.07292, over 4720454.56 frames. ], batch size: 285, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:54:25,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 07:54:29,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:54:30,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:36,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:37,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:54:39,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=298706.6666666667, ans=0.125 2023-09-29 07:54:40,826 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 07:54:42,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:49,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:54:49,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:54:49,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:54:50,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:50,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 07:54:50,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:54:53,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:54:53,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:54:57,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:55:00,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:55:02,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:07,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:55:07,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 07:55:10,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:55:14,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:55:17,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:55:19,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:55:19,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:55:19,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 07:55:24,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:55:24,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:24,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:24,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:55:25,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:27,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 07:55:27,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:55:27,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 07:55:27,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:55:27,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:27,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 07:55:27,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=298840.0, ans=0.125 2023-09-29 07:55:34,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:55:36,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=298906.6666666667, ans=0.125 2023-09-29 07:55:38,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=298906.6666666667, ans=0.125 2023-09-29 07:55:40,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:55:44,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 07:55:44,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:55:44,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:55:47,469 INFO [train.py:1039] (0/4) Epoch 9, batch 2350, loss[loss=0.182, simple_loss=0.2566, pruned_loss=0.05368, over 24331.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2805, pruned_loss=0.07277, over 4725628.06 frames. ], batch size: 56, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:55:47,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:55:47,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:55:47,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:55:47,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 07:55:55,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:55:55,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 07:56:02,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 07:56:07,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 07:56:10,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:10,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=299040.0, ans=0.125 2023-09-29 07:56:11,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:11,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 07:56:15,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:56:20,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 07:56:22,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:25,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:56:25,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:56:28,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 07:56:30,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 07:56:30,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:56:32,565 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:56:33,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:33,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:56:33,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:56:37,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:56:40,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 07:56:40,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:56:43,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:43,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:56:45,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. 
Duration: 0.54 2023-09-29 07:56:47,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:56:49,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 07:56:50,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:56:54,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 07:56:58,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 07:57:00,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:57:00,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:57:00,312 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 07:57:00,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 07:57:00,996 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.00 vs. limit=22.5 2023-09-29 07:57:01,784 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.074e+02 2.290e+02 2.554e+02 3.364e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-29 07:57:04,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-09-29 07:57:05,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:57:10,763 INFO [train.py:1039] (0/4) Epoch 9, batch 2400, loss[loss=0.2271, simple_loss=0.2805, pruned_loss=0.08685, over 23809.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.2812, pruned_loss=0.07329, over 4718000.91 frames. ], batch size: 212, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:57:10,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:57:13,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:57:16,265 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.76 vs. limit=10.0 2023-09-29 07:57:16,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:57:17,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 07:57:17,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 07:57:20,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=299306.6666666667, ans=0.1 2023-09-29 07:57:26,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:57:26,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 07:57:28,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 07:57:28,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:57:29,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:31,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 07:57:38,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:38,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 07:57:43,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:57:44,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=299440.0, ans=0.1 2023-09-29 07:57:45,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=299440.0, ans=0.125 2023-09-29 07:57:47,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 07:57:50,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:57:51,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 07:57:57,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:57:58,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 07:57:58,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:58:03,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:07,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:12,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:13,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:58:13,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:58:13,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:58:14,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:15,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:15,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 07:58:18,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:58:19,544 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.39 vs. limit=12.0 2023-09-29 07:58:20,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:58:20,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 07:58:21,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 07:58:23,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:58:24,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:24,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 07:58:26,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 07:58:26,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 07:58:26,200 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 07:58:26,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. 
Duration: 0.57 2023-09-29 07:58:27,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:58:30,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:30,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:31,554 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 07:58:31,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:31,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:58:33,130 INFO [train.py:1039] (0/4) Epoch 9, batch 2450, loss[loss=0.2287, simple_loss=0.2825, pruned_loss=0.08749, over 23759.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2796, pruned_loss=0.07311, over 4705839.62 frames. ], batch size: 179, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:58:36,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:58:36,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:42,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:42,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 07:58:44,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 07:58:50,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:50,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:53,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=299706.6666666667, ans=0.125 2023-09-29 07:58:54,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:58:54,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:58:54,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:58:54,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 07:58:58,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:59,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:59:01,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 07:59:04,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:59:06,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:06,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:07,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:59:10,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 07:59:12,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:59:13,330 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.01 vs. limit=6.0 2023-09-29 07:59:19,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:21,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:59:21,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:21,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 07:59:21,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:23,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:59:24,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 07:59:26,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:27,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:59:30,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:59:30,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:36,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:59:36,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 07:59:37,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:59:39,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:59:39,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. 
Duration: 0.87 2023-09-29 07:59:40,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:59:42,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:59:45,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=299906.6666666667, ans=0.125 2023-09-29 07:59:46,461 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 2.171e+02 2.453e+02 2.838e+02 4.289e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-29 07:59:46,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:59:46,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=299906.6666666667, ans=0.0 2023-09-29 07:59:48,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:50,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:59:53,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 07:59:54,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:59:56,043 INFO [train.py:1039] (0/4) Epoch 9, batch 2500, loss[loss=0.2305, simple_loss=0.3041, pruned_loss=0.07846, over 23991.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2794, pruned_loss=0.07254, over 4714253.22 frames. 
], batch size: 80, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 08:00:01,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:12,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:00:12,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:00:14,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:14,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 08:00:21,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:00:21,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:00:23,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:00:23,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:00:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 08:00:23,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:00:25,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:25,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 08:00:25,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:28,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 08:00:28,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:30,682 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.73 vs. limit=15.0 2023-09-29 08:00:33,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:00:33,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:36,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:00:38,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 08:00:38,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:00:41,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:00:44,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:44,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=300173.3333333333, ans=0.1 2023-09-29 08:00:49,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:52,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:00:55,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=300173.3333333333, ans=0.125 2023-09-29 08:00:57,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:01:01,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 08:01:01,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:01,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:04,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:01:04,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:01:04,609 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. 
Duration: 0.896 2023-09-29 08:01:04,610 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 08:01:04,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 08:01:04,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=300240.0, ans=0.125 2023-09-29 08:01:08,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:09,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 08:01:09,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 08:01:11,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:01:11,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 08:01:14,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 08:01:16,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:18,159 INFO [train.py:1039] (0/4) Epoch 9, batch 2550, loss[loss=0.218, simple_loss=0.2832, pruned_loss=0.07638, over 23652.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2799, pruned_loss=0.07283, over 4713157.18 frames. 
], batch size: 149, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:01:19,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:01:19,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:01:21,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=300306.6666666667, ans=0.1 2023-09-29 08:01:22,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:22,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 08:01:22,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:01:24,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=300306.6666666667, ans=0.125 2023-09-29 08:01:27,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 08:01:27,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:01:30,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:33,619 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.52 vs. 
limit=15.0 2023-09-29 08:01:34,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:34,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:01:35,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:01:35,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:01:37,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:01:40,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 08:01:42,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:42,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:42,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 08:01:56,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:02:02,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:02:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:04,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:02:04,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:02:04,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=300506.6666666667, ans=0.125 2023-09-29 08:02:11,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:02:13,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:02:15,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:02:15,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:02:15,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:02:16,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:02:19,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:02:19,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:23,597 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.47 vs. limit=15.0 2023-09-29 08:02:24,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:02:26,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 08:02:26,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:02:26,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:26,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:02:29,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:02:29,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:30,790 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.909e+02 2.105e+02 2.404e+02 4.394e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 08:02:35,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:02:37,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:39,453 INFO [train.py:1039] (0/4) Epoch 9, batch 2600, loss[loss=0.2001, simple_loss=0.2725, pruned_loss=0.06387, over 23343.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2806, pruned_loss=0.07224, over 4722244.08 frames. ], batch size: 119, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:02:39,658 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 08:02:42,754 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 08:02:42,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:02:42,859 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 08:02:42,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 08:02:43,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=300640.0, ans=0.125 2023-09-29 08:02:44,238 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 08:02:46,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:46,614 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 08:02:48,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. 
Duration: 0.97 2023-09-29 08:02:50,084 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 08:02:53,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:02:54,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 08:02:54,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=300706.6666666667, ans=0.0 2023-09-29 08:02:55,464 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.09 vs. limit=15.0 2023-09-29 08:02:56,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 08:02:58,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:02:58,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 08:03:01,291 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 08:03:01,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 08:03:06,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=300706.6666666667, ans=0.0 2023-09-29 08:03:07,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:03:07,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:07,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:07,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 08:03:07,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=300706.6666666667, ans=0.125 2023-09-29 08:03:09,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:03:09,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=300706.6666666667, ans=0.125 2023-09-29 08:03:17,382 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 08:03:24,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:24,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:26,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 08:03:27,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:27,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 08:03:27,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 08:03:30,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:03:30,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:03:34,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:39,133 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 08:03:39,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:40,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:03:45,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:47,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:03:47,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 08:03:47,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:49,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:03:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:03:57,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 08:03:58,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:59,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=300906.6666666667, ans=0.125 2023-09-29 08:04:00,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:04:00,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=300973.3333333333, ans=0.07 2023-09-29 08:04:01,777 INFO [train.py:1039] (0/4) Epoch 9, batch 2650, loss[loss=0.1908, simple_loss=0.2621, pruned_loss=0.05976, over 24612.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.281, pruned_loss=0.07278, over 4723882.67 frames. ], batch size: 60, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:04:03,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 08:04:03,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:05,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:04:05,235 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. 
Duration: 0.768 2023-09-29 08:04:05,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:08,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:11,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:04:11,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=300973.3333333333, ans=0.125 2023-09-29 08:04:13,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:04:16,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:04:16,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 08:04:16,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:04:17,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:04:20,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 08:04:23,105 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 08:04:26,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:04:27,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 08:04:27,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:29,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 08:04:35,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:04:35,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:39,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 08:04:39,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 08:04:43,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:04:45,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=301106.6666666667, ans=0.0 2023-09-29 08:04:47,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-09-29 08:04:47,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:49,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:50,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:04:50,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:52,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:53,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:55,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:04:55,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:55,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:04:57,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:04:59,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:59,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 08:05:00,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:05:02,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:05:06,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:06,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=301240.0, ans=0.05 2023-09-29 08:05:07,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:05:07,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:07,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 08:05:13,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:05:13,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:05:15,915 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.153e+02 2.553e+02 3.125e+02 4.988e+02, threshold=5.107e+02, percent-clipped=5.0 2023-09-29 08:05:17,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:17,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:19,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:05:20,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:22,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:22,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 08:05:23,523 INFO [train.py:1039] (0/4) Epoch 9, batch 2700, loss[loss=0.2086, simple_loss=0.289, pruned_loss=0.06411, over 24677.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2821, pruned_loss=0.07369, over 4720723.37 frames. ], batch size: 68, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:05:26,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:05:28,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:05:30,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:05:30,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:30,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:31,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:05:31,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:31,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:05:31,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:05:31,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 08:05:31,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=301306.6666666667, ans=0.125 2023-09-29 08:05:33,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:05:36,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:05:38,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:05:38,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:05:45,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:05:45,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 08:05:46,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:05:52,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:05:52,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:05:53,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=301373.3333333333, ans=0.125 2023-09-29 08:05:58,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:05:58,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:58,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:05:58,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:06:02,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:05,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:06:05,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=301440.0, ans=0.125 2023-09-29 08:06:06,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:06:06,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:06:10,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:10,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:06:15,444 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.91 vs. limit=22.5 2023-09-29 08:06:19,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:06:21,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:06:25,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:06:25,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:27,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:06:29,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:30,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:32,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:33,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:33,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:06:36,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:06:38,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:38,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:40,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 08:06:42,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:43,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:06:43,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-09-29 08:06:45,107 INFO [train.py:1039] (0/4) Epoch 9, batch 2750, loss[loss=0.1953, simple_loss=0.2678, pruned_loss=0.06141, over 24576.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2812, pruned_loss=0.07309, over 4732921.83 frames. ], batch size: 60, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:06:45,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 08:06:46,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:50,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:06:50,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:54,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:54,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:06:54,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:57,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:06:58,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-29 08:06:59,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=301640.0, ans=0.2 2023-09-29 08:07:00,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:07:00,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:00,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 08:07:00,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:07:00,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:07:05,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 08:07:07,325 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=301706.6666666667, ans=0.07 2023-09-29 08:07:08,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:07:08,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:08,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=301706.6666666667, ans=0.125 2023-09-29 08:07:10,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 08:07:10,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:07:10,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:07:11,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:07:11,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:13,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:14,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=301706.6666666667, ans=0.2 2023-09-29 08:07:18,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:07:18,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:07:18,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:07:20,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:21,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 08:07:28,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=301773.3333333333, ans=0.0 2023-09-29 08:07:30,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:31,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:07:33,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:38,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:38,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:07:38,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:07:43,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:07:43,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:43,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 08:07:46,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=301840.0, ans=0.125 2023-09-29 08:07:48,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:07:50,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 08:07:56,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:07:59,502 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.994e+02 2.310e+02 2.732e+02 5.086e+02, threshold=4.620e+02, percent-clipped=0.0 2023-09-29 08:08:00,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=301906.6666666667, ans=0.2 2023-09-29 08:08:01,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:08:01,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 08:08:03,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:03,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:08:04,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 08:08:04,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:08:06,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:08:06,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:08:08,080 INFO [train.py:1039] (0/4) Epoch 9, batch 2800, loss[loss=0.1964, simple_loss=0.2546, pruned_loss=0.0691, over 23834.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.28, pruned_loss=0.07331, over 4711334.56 frames. ], batch size: 179, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:08:08,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:08:10,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 08:08:10,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:11,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:13,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:14,769 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 08:08:14,770 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 08:08:17,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:21,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:08:21,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 08:08:26,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:08:26,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 08:08:29,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:08:30,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 08:08:31,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:31,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:08:31,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:31,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=302040.0, ans=0.0 2023-09-29 08:08:36,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:08:36,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:36,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:08:38,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:08:47,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:08:49,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:51,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:54,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:56,486 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=302173.3333333333, ans=0.125 2023-09-29 08:08:59,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:08:59,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 08:08:59,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:01,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:01,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 08:09:04,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:05,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:10,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:09:12,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:09:12,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:12,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:09:12,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:09:14,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:09:15,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:09:15,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 08:09:15,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:17,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:09:17,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:19,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 08:09:21,596 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=11.16 vs. limit=15.0 2023-09-29 08:09:21,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:21,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:09:22,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:09:23,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 08:09:30,409 INFO [train.py:1039] (0/4) Epoch 9, batch 2850, loss[loss=0.1921, simple_loss=0.2623, pruned_loss=0.06091, over 24274.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2788, pruned_loss=0.07194, over 4711335.01 frames. ], batch size: 61, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:09:30,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:30,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:09:32,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 08:09:33,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:37,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:09:37,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:09:38,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:40,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:41,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:44,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:09:44,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 08:09:50,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 08:09:50,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:53,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. 
Duration: 0.95 2023-09-29 08:09:54,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=302373.3333333333, ans=0.125 2023-09-29 08:09:55,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:56,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 08:09:59,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 08:10:00,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:09,131 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=7.30 vs. limit=12.0 2023-09-29 08:10:13,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:14,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:14,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:10:16,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:10:16,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:10:16,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 08:10:18,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:10:18,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 08:10:19,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:10:21,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:21,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:21,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:24,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:24,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:25,193 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=302506.6666666667, ans=0.0 2023-09-29 08:10:26,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:28,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 08:10:28,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=302506.6666666667, ans=0.1 2023-09-29 08:10:29,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:10:31,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:33,247 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.10 vs. limit=15.0 2023-09-29 08:10:34,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:36,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:10:41,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:10:43,749 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.988e+02 2.184e+02 2.463e+02 3.940e+02, threshold=4.369e+02, percent-clipped=0.0 2023-09-29 08:10:43,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 08:10:43,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 08:10:44,587 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.32 vs. 
limit=22.5 2023-09-29 08:10:45,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:10:47,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:47,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 08:10:49,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:10:49,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:49,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:50,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:10:50,640 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 08:10:50,729 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 08:10:50,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:10:51,989 INFO [train.py:1039] (0/4) Epoch 9, batch 2900, loss[loss=0.2207, simple_loss=0.2784, pruned_loss=0.08146, over 23709.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2786, pruned_loss=0.07177, over 4719053.87 frames. 
], batch size: 232, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:10:52,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:57,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:10:57,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:59,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:59,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 08:11:04,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:04,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 08:11:06,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 08:11:06,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:11:06,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:11:07,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:11:08,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=302706.6666666667, ans=0.125 2023-09-29 08:11:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:11:11,836 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=302706.6666666667, ans=0.5 2023-09-29 08:11:14,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:11:14,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:17,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:11:18,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 08:11:20,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:11:20,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:23,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 08:11:25,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 08:11:28,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 08:11:28,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 08:11:28,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:11:32,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:11:32,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:11:33,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:35,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:39,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:11:41,233 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.15 vs. limit=15.0 2023-09-29 08:11:41,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:11:42,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 08:11:44,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 08:11:44,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 08:11:48,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:11:50,547 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:11:51,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 08:11:53,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:11:57,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:12:05,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=302906.6666666667, ans=0.125 2023-09-29 08:12:08,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:12:08,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:12:09,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 08:12:14,610 INFO [train.py:1039] (0/4) Epoch 9, batch 2950, loss[loss=0.1987, simple_loss=0.2799, pruned_loss=0.05876, over 24432.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.2793, pruned_loss=0.07205, over 4724689.93 frames. ], batch size: 66, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:12:14,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:12:14,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 08:12:14,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:16,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:12:21,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:23,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 08:12:24,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:24,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:26,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:12:27,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:12:29,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 08:12:30,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 08:12:30,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 08:12:30,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:33,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=303040.0, ans=0.125 2023-09-29 08:12:36,969 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.26 vs. limit=22.5 2023-09-29 08:12:37,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:12:39,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:12:40,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=303040.0, ans=15.0 2023-09-29 08:12:41,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:12:41,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:12:46,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:12:46,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:12:47,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:12:49,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:12:51,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=303106.6666666667, ans=0.0 2023-09-29 08:12:52,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 08:12:57,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 08:12:57,985 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 08:12:59,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:13:00,863 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 08:13:02,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 08:13:02,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:03,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:13:03,908 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. 
Duration: 0.896 2023-09-29 08:13:03,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:13:08,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 08:13:08,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:13:10,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:13:14,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:15,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:13:16,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:16,445 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 08:13:16,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:16,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 08:13:22,513 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.89 vs. limit=6.0 2023-09-29 08:13:23,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:13:23,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:13:23,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=303240.0, ans=0.125 2023-09-29 08:13:25,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 08:13:25,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:13:26,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 08:13:28,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:29,731 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.974e+02 2.174e+02 2.569e+02 4.331e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-29 08:13:30,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:13:31,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:13:33,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:33,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:13:35,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 08:13:36,351 INFO [train.py:1039] (0/4) Epoch 9, batch 3000, loss[loss=0.229, simple_loss=0.2996, pruned_loss=0.07921, over 23327.00 frames. ], tot_loss[loss=0.2126, simple_loss=0.2806, pruned_loss=0.07231, over 4734966.53 frames. ], batch size: 93, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:13:36,352 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 08:13:49,673 INFO [train.py:1071] (0/4) Epoch 9, validation: loss=0.2838, simple_loss=0.2753, pruned_loss=0.1462, over 1125622.00 frames. 2023-09-29 08:13:49,674 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 08:13:49,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:49,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:13:49,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:13:50,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:52,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:13:53,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:53,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 08:13:55,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:13:55,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=303306.6666666667, ans=0.2 2023-09-29 08:13:58,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:59,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:14:01,576 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 08:14:01,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 08:14:05,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:14:06,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:14:06,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 08:14:08,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:10,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=303373.3333333333, ans=0.125 2023-09-29 08:14:14,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 08:14:21,352 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:14:25,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:14:27,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=303440.0, ans=0.0 2023-09-29 08:14:30,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 08:14:31,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:14:33,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:14:33,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:33,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:14:36,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:37,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 08:14:40,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 08:14:40,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:14:41,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:14:43,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:14:43,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:44,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:14:44,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:14:49,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:14:49,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:49,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:14:52,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:55,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 08:14:57,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:14:58,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:14:58,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:15:02,506 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.95 vs. limit=15.0 2023-09-29 08:15:03,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:03,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:04,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:15:04,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 08:15:06,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:06,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 08:15:06,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:15:07,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 08:15:10,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:10,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 08:15:10,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 08:15:12,249 INFO [train.py:1039] (0/4) Epoch 9, batch 3050, loss[loss=0.1817, simple_loss=0.2579, pruned_loss=0.05274, over 24453.00 frames. ], tot_loss[loss=0.2141, simple_loss=0.2817, pruned_loss=0.07325, over 4714443.73 frames. ], batch size: 63, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:15:12,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 08:15:12,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:15:13,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:15:15,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:15,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:15:15,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:16,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:15:19,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 08:15:20,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 08:15:24,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:24,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:15:26,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=303640.0, ans=0.1 2023-09-29 08:15:29,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:31,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 08:15:38,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 08:15:39,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 08:15:39,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:42,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:15:44,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=303773.3333333333, ans=0.125 2023-09-29 08:15:45,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:45,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 08:15:45,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:49,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:15:50,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:50,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:50,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:50,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:52,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:54,757 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.53 vs. limit=6.0 2023-09-29 08:15:56,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:59,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:00,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. 
Duration: 0.96 2023-09-29 08:16:00,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:16:00,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:16:02,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=303840.0, ans=0.0 2023-09-29 08:16:04,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:16:06,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:16:06,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:07,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:12,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:16:12,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=303840.0, ans=0.125 2023-09-29 08:16:12,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=303840.0, ans=0.0 2023-09-29 08:16:13,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 08:16:13,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=303840.0, ans=0.2 2023-09-29 08:16:18,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:19,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:16:19,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:21,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:21,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:16:21,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:16:23,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 08:16:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:26,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:26,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. 
Duration: 0.98 2023-09-29 08:16:28,063 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 1.995e+02 2.261e+02 2.647e+02 3.760e+02, threshold=4.522e+02, percent-clipped=0.0 2023-09-29 08:16:28,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:35,197 INFO [train.py:1039] (0/4) Epoch 9, batch 3100, loss[loss=0.2202, simple_loss=0.268, pruned_loss=0.08623, over 23358.00 frames. ], tot_loss[loss=0.2147, simple_loss=0.2816, pruned_loss=0.07391, over 4699351.68 frames. ], batch size: 285, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:16:35,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:36,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:16:40,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:16:42,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 08:16:44,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 08:16:45,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 08:16:47,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:16:50,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:16:50,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:53,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:16:58,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:01,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=304040.0, ans=0.5 2023-09-29 08:17:03,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 08:17:03,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=304040.0, ans=0.1 2023-09-29 08:17:08,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:17:10,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:10,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:10,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:17:12,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:17:14,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:17:14,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 08:17:14,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:17:15,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:17,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 08:17:18,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:17:23,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:17:24,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 08:17:24,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 08:17:26,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:26,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:29,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:29,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:17:29,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:17:31,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:17:31,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:17:33,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:17:33,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:17:33,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:33,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:17:37,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=304173.3333333333, ans=0.04949747468305833 2023-09-29 08:17:38,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:39,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 08:17:42,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 08:17:44,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 08:17:45,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:45,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:47,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 08:17:55,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 08:17:57,808 INFO [train.py:1039] (0/4) Epoch 9, batch 3150, loss[loss=0.2091, simple_loss=0.2696, pruned_loss=0.07434, over 23812.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2799, pruned_loss=0.07295, over 4712481.27 frames. ], batch size: 179, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:17:57,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:17:59,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:01,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:18:01,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:18:02,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. 
Duration: 0.96 2023-09-29 08:18:04,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:04,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:18:07,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 08:18:09,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:11,206 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 08:18:14,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 08:18:14,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:18:14,421 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 08:18:15,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:18:16,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=304373.3333333333, ans=0.0 2023-09-29 08:18:18,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. 
Duration: 0.95 2023-09-29 08:18:18,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=304373.3333333333, ans=0.125 2023-09-29 08:18:19,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 08:18:19,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 08:18:19,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:19,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:20,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:21,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=304373.3333333333, ans=0.125 2023-09-29 08:18:23,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 08:18:26,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:26,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:27,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:29,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 08:18:32,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 08:18:33,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:18:36,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:18:38,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:38,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 08:18:40,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 08:18:41,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:18:42,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:18:42,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:18:42,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:42,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:18:45,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 08:18:45,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:18:45,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 08:18:46,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:18:47,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:48,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:18:49,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:51,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 08:18:51,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:18:53,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 08:18:53,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:55,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 08:18:57,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. 
Duration: 0.95 2023-09-29 08:18:57,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:18:58,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:00,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 08:19:01,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 08:19:01,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:19:03,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:19:05,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:06,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:19:13,058 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 2.121e+02 2.392e+02 3.280e+02 6.565e+02, threshold=4.784e+02, percent-clipped=9.0 2023-09-29 08:19:13,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:19:13,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:19:16,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 08:19:19,826 INFO [train.py:1039] (0/4) Epoch 9, batch 3200, loss[loss=0.185, simple_loss=0.2669, pruned_loss=0.0516, over 24471.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2791, pruned_loss=0.07253, over 4697168.82 frames. ], batch size: 66, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:19:21,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:19:21,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:19:21,941 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=304640.0, ans=0.0 2023-09-29 08:19:26,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:28,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:19:28,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 08:19:30,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:36,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:19:38,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:19:48,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:19:48,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=304706.6666666667, ans=0.125 2023-09-29 08:19:53,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=304773.3333333333, ans=0.0 2023-09-29 08:19:53,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=304773.3333333333, ans=0.0 2023-09-29 08:19:58,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 08:19:58,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:20:01,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 08:20:02,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:20:07,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:20:07,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:20:08,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:20:13,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-09-29 08:20:13,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:20:16,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 08:20:20,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 08:20:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:20:28,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:28,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:20:29,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:29,730 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 08:20:29,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:20:32,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:20:36,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 08:20:36,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-09-29 08:20:38,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 08:20:39,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 08:20:41,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:20:43,060 INFO [train.py:1039] (0/4) Epoch 9, batch 3250, loss[loss=0.2266, simple_loss=0.2862, pruned_loss=0.08353, over 23717.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2794, pruned_loss=0.07193, over 4709541.13 frames. ], batch size: 212, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:20:43,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:20:44,811 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 08:20:44,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:20:44,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:20:45,044 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 08:20:45,796 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.51 vs. limit=15.0 2023-09-29 08:20:47,536 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.78 vs. 
limit=15.0 2023-09-29 08:20:48,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=304973.3333333333, ans=0.1 2023-09-29 08:20:49,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:20:52,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:20:52,962 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=304973.3333333333, ans=0.125 2023-09-29 08:21:02,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:02,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 08:21:02,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:04,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:21:04,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:05,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:05,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 08:21:09,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:09,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:21:11,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:11,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:15,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:17,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:18,106 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=12.0 2023-09-29 08:21:18,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:18,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:21:21,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:21,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:21,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:21,910 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.02 vs. limit=15.0 2023-09-29 08:21:23,376 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=12.0 2023-09-29 08:21:26,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 08:21:26,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:21:26,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:21:28,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:28,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:21:34,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 08:21:36,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=305173.3333333333, ans=10.0 2023-09-29 08:21:38,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=305173.3333333333, ans=0.125 2023-09-29 08:21:46,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:21:46,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:46,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 08:21:46,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:21:46,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:21:47,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:49,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 08:21:49,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 08:21:50,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:50,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 08:21:52,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:52,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:21:54,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:57,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:58,885 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.015e+02 2.327e+02 2.716e+02 4.299e+02, threshold=4.655e+02, percent-clipped=0.0 2023-09-29 08:21:59,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:01,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 08:22:01,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:01,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=305240.0, ans=0.0 2023-09-29 08:22:02,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:22:02,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 08:22:06,380 INFO [train.py:1039] (0/4) Epoch 9, batch 3300, loss[loss=0.2048, simple_loss=0.2772, pruned_loss=0.0662, over 24496.00 frames. 
], tot_loss[loss=0.2111, simple_loss=0.2794, pruned_loss=0.07139, over 4719124.82 frames. ], batch size: 66, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:22:06,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:22:06,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 08:22:09,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 08:22:11,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 08:22:11,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:17,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:18,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:22:18,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:19,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:22:21,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:22:21,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:22:22,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:22:23,583 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.98 vs. limit=15.0 2023-09-29 08:22:27,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 08:22:29,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:22:29,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:30,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:32,110 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 08:22:34,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:22:34,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:22:35,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:22:35,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:22:35,942 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. 
Duration: 0.896 2023-09-29 08:22:39,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:39,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:22:41,767 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.04 vs. limit=22.5 2023-09-29 08:22:42,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:42,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 08:22:44,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 08:22:44,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:44,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:22:47,404 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 08:22:48,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 08:22:48,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:22:52,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. 
Duration: 0.39 2023-09-29 08:22:54,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:22:57,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:22:58,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:00,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:01,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:01,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:23:01,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:23:03,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:23:03,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:05,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:23:07,115 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 08:23:09,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. 
Duration: 0.55 2023-09-29 08:23:10,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:23:10,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:10,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:13,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:13,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:15,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:23:17,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:17,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:23:18,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:19,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=305573.3333333333, ans=0.0 2023-09-29 08:23:20,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:23:23,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. 
Duration: 0.77 2023-09-29 08:23:23,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:23,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:25,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:23:25,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:23:28,432 INFO [train.py:1039] (0/4) Epoch 9, batch 3350, loss[loss=0.212, simple_loss=0.2923, pruned_loss=0.0658, over 24433.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2814, pruned_loss=0.07256, over 4714563.47 frames. ], batch size: 69, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:23:28,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:30,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:30,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:33,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:34,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:36,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 08:23:38,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:38,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:23:40,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:42,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:23:44,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 08:23:45,638 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 08:23:45,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:49,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 08:23:49,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 08:23:49,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:23:49,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:23:52,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 08:23:52,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 08:23:53,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:53,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:23:56,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:59,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:59,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:24:00,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:24:05,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:06,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:07,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:11,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:24:13,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:24:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:16,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:19,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:21,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 08:24:22,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=305840.0, ans=0.2 2023-09-29 08:24:23,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:24:23,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 08:24:23,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:24:24,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 08:24:25,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:27,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:33,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 08:24:33,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=305906.6666666667, ans=0.125 2023-09-29 08:24:34,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 08:24:34,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:24:37,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:24:37,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:24:42,631 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.16 vs. limit=15.0 2023-09-29 08:24:43,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:24:44,740 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 2.029e+02 2.244e+02 2.615e+02 3.935e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 08:24:44,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 08:24:46,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:24:46,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 08:24:47,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=305906.6666666667, ans=0.0 2023-09-29 08:24:48,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=305906.6666666667, ans=0.125 2023-09-29 08:24:49,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:50,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 08:24:50,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:51,468 INFO [train.py:1039] (0/4) Epoch 9, batch 3400, loss[loss=0.2232, simple_loss=0.2791, pruned_loss=0.08362, over 23670.00 frames. ], tot_loss[loss=0.2149, simple_loss=0.2823, pruned_loss=0.07373, over 4708315.70 frames. ], batch size: 232, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:24:51,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 08:24:53,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:24:55,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 08:24:56,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 08:25:00,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 08:25:00,147 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 08:25:00,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:05,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:25:05,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:25:05,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=305973.3333333333, ans=0.0 2023-09-29 08:25:06,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:08,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:25:08,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=306040.0, ans=0.2 2023-09-29 08:25:13,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 08:25:14,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=306040.0, ans=0.125 2023-09-29 08:25:16,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 08:25:23,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:25:24,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:25,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:26,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:25:27,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=306106.6666666667, ans=0.0 2023-09-29 08:25:29,972 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:25:34,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:25:38,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-09-29 08:25:40,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=306173.3333333333, ans=0.1 2023-09-29 08:25:41,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=306173.3333333333, ans=0.0 2023-09-29 08:25:46,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 08:25:47,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:25:47,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:48,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:48,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:25:51,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:54,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:25:54,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 08:26:01,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:03,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 08:26:09,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:26:13,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 08:26:15,220 INFO [train.py:1039] (0/4) Epoch 9, batch 3450, loss[loss=0.1957, simple_loss=0.2636, pruned_loss=0.06394, over 20664.00 frames. ], tot_loss[loss=0.2138, simple_loss=0.2816, pruned_loss=0.07301, over 4716487.36 frames. ], batch size: 45, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:26:16,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 08:26:18,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:26:19,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:26:19,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 08:26:20,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:23,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 08:26:30,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=306373.3333333333, ans=0.07 2023-09-29 08:26:32,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:26:33,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:33,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:26:33,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:38,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:38,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=306373.3333333333, ans=0.1 2023-09-29 08:26:45,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 08:26:50,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 08:26:50,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:26:50,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 08:26:50,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=306440.0, ans=0.125 2023-09-29 08:26:52,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:57,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 08:26:58,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:27:02,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:02,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:27:03,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:27:05,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:27:07,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 08:27:07,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:07,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:27:10,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:27:13,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 08:27:18,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:27:22,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:27:22,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:27,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:32,306 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.104e+02 2.323e+02 2.853e+02 3.879e+02, threshold=4.645e+02, percent-clipped=0.0 2023-09-29 08:27:32,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:32,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:32,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 08:27:32,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=306573.3333333333, ans=0.125 2023-09-29 08:27:32,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=306573.3333333333, ans=0.2 2023-09-29 08:27:34,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:34,909 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=14.21 vs. limit=15.0 2023-09-29 08:27:38,560 INFO [train.py:1039] (0/4) Epoch 9, batch 3500, loss[loss=0.1738, simple_loss=0.2415, pruned_loss=0.05306, over 24369.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2797, pruned_loss=0.07203, over 4713033.46 frames. ], batch size: 56, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:27:38,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:44,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:27:44,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 08:27:47,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:27:51,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=306640.0, ans=0.125 2023-09-29 08:27:52,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. 
Duration: 0.588875 2023-09-29 08:27:53,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:53,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 08:27:58,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:28:00,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:28:02,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:28:02,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:03,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:28:03,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:03,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:05,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 08:28:08,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:09,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 08:28:11,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:14,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:14,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=306773.3333333333, ans=0.125 2023-09-29 08:28:15,601 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-09-29 08:28:16,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 08:28:16,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:20,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:20,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:28:21,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:23,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:28:25,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 08:28:25,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 08:28:28,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 08:28:28,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 08:28:28,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:29,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:29,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:30,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:28:35,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:28:35,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:28:40,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:28:41,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 08:28:41,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. 
Duration: 0.88 2023-09-29 08:28:41,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:28:41,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=306840.0, ans=0.0 2023-09-29 08:28:45,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:28:45,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:47,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:49,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 08:28:50,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:52,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:54,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 08:28:56,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 08:29:00,989 INFO [train.py:1039] (0/4) Epoch 9, batch 3550, loss[loss=0.209, simple_loss=0.2708, pruned_loss=0.07356, over 23380.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2777, pruned_loss=0.07143, over 4706141.45 frames. 
], batch size: 105, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:29:01,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:01,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=306973.3333333333, ans=0.125 2023-09-29 08:29:02,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:29:02,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:04,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:05,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:29:13,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:15,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:29:17,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:19,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:29:20,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:29:22,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:29:22,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:29:25,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:29:25,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:26,247 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=15.0 2023-09-29 08:29:27,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:29:27,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:29:31,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=307040.0, ans=0.0 2023-09-29 08:29:34,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:29:34,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 08:29:35,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:35,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:35,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:29:36,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 08:29:37,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:29:47,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:47,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:47,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:50,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 08:29:50,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 08:29:51,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 08:29:53,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:56,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:29:56,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:30:01,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 08:30:01,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:08,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:10,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 08:30:10,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:15,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:30:15,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-09-29 08:30:17,026 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.963e+02 2.242e+02 2.644e+02 4.260e+02, threshold=4.484e+02, percent-clipped=0.0 2023-09-29 08:30:17,647 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:30:21,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 08:30:22,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:30:22,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:30:23,415 INFO [train.py:1039] (0/4) Epoch 9, batch 3600, loss[loss=0.2285, simple_loss=0.2833, pruned_loss=0.08687, over 23455.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.2771, pruned_loss=0.07089, over 4716939.06 frames. ], batch size: 285, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:30:24,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:25,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:26,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:30:29,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:30:32,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:30:32,955 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=307306.6666666667, ans=0.0 2023-09-29 08:30:34,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:30:36,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:36,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 08:30:37,215 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.43 vs. limit=15.0 2023-09-29 08:30:40,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:30:40,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:43,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:46,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:50,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 08:30:50,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:50,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 08:30:51,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:54,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:56,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:30:57,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:59,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:59,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:00,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 08:31:07,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=307440.0, ans=0.1 2023-09-29 08:31:07,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=307440.0, ans=0.1 2023-09-29 08:31:08,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:31:10,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:31:12,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 08:31:12,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=307506.6666666667, ans=0.125 2023-09-29 08:31:16,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:31:22,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:23,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=307506.6666666667, ans=0.125 2023-09-29 08:31:25,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:31,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:31:31,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:31:31,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. 
Duration: 0.78 2023-09-29 08:31:32,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=307573.3333333333, ans=0.125 2023-09-29 08:31:32,747 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.09 vs. limit=22.5 2023-09-29 08:31:33,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 08:31:35,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 08:31:36,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:38,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:31:38,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 08:31:39,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:31:39,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:31:39,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:31:40,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=307573.3333333333, ans=0.125 2023-09-29 08:31:41,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 08:31:41,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 08:31:45,441 INFO [train.py:1039] (0/4) Epoch 9, batch 3650, loss[loss=0.2112, simple_loss=0.2774, pruned_loss=0.07247, over 23755.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2781, pruned_loss=0.07124, over 4723176.98 frames. ], batch size: 212, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:31:45,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:45,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=307640.0, ans=0.0 2023-09-29 08:31:47,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 08:31:51,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 08:31:52,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:31:56,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 08:31:57,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. 
Duration: 0.97 2023-09-29 08:32:00,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:02,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:32:02,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:32:06,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:32:08,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:32:09,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 08:32:09,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:32:09,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:11,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 08:32:11,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:32:11,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:32:11,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 08:32:14,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:32:17,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 08:32:17,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 08:32:19,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:32:22,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 08:32:24,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:25,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:32:26,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=307773.3333333333, ans=0.125 2023-09-29 08:32:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:32:33,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:33,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:32:35,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 08:32:35,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:32:36,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:32:37,611 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.12 vs. limit=15.0 2023-09-29 08:32:41,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:43,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:43,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:44,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:32:46,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:46,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:32:53,226 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.34 vs. limit=22.5 2023-09-29 08:32:54,761 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 08:33:00,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 08:33:00,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:01,736 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.073e+02 2.361e+02 2.805e+02 4.754e+02, threshold=4.723e+02, percent-clipped=2.0 2023-09-29 08:33:01,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:33:01,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:02,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=307906.6666666667, ans=0.125 2023-09-29 08:33:03,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:33:05,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:07,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 08:33:07,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:08,645 INFO [train.py:1039] (0/4) Epoch 9, batch 3700, loss[loss=0.2072, simple_loss=0.2774, pruned_loss=0.06845, over 24480.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2792, pruned_loss=0.07172, over 4726106.65 frames. ], batch size: 63, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:33:10,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 08:33:11,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:33:11,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:33:12,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:12,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 08:33:12,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:14,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:33:14,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:33:17,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:33:19,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:19,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:21,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:33:21,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:33:22,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:33:24,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:26,588 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 08:33:35,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:33:35,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:33:38,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:33:38,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 08:33:38,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=308040.0, ans=0.125 2023-09-29 08:33:40,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:43,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:44,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 08:33:45,498 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.31 vs. 
limit=15.0 2023-09-29 08:33:46,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:46,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=308106.6666666667, ans=0.1 2023-09-29 08:33:47,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:33:50,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:50,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:33:52,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:33:55,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:55,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 08:33:57,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:57,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 08:34:03,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:34:04,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 08:34:07,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:09,160 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.00 vs. limit=15.0 2023-09-29 08:34:09,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 08:34:12,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:34:12,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:34:13,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:13,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:16,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:18,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 08:34:19,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 08:34:19,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:34:19,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:34:22,000 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-09-29 08:34:22,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:34:24,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:34:25,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:34:27,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:34:28,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:34:30,327 INFO [train.py:1039] (0/4) Epoch 9, batch 3750, loss[loss=0.2075, simple_loss=0.2915, pruned_loss=0.06173, over 24010.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.2812, pruned_loss=0.07243, over 4727790.19 frames. ], batch size: 80, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:34:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 08:34:32,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:34:33,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:34:35,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. 
Duration: 0.78 2023-09-29 08:34:35,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:34:38,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:38,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:39,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:34:42,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:34:47,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:34:48,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=308373.3333333333, ans=0.125 2023-09-29 08:34:49,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:34:49,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:34:54,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. 
Duration: 0.96 2023-09-29 08:34:57,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:34:59,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:34:59,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:35:02,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 08:35:05,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 08:35:07,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:35:08,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:35:11,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:35:15,874 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=308440.0, ans=0.0 2023-09-29 08:35:17,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:18,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:35:23,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. 
Duration: 0.98 2023-09-29 08:35:23,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=308506.6666666667, ans=0.0 2023-09-29 08:35:27,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:30,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:35:30,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:35:35,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:35:39,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:35:41,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:35:41,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=308573.3333333333, ans=0.125 2023-09-29 08:35:43,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:35:45,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:35:46,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 08:35:48,984 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.167e+02 2.525e+02 3.168e+02 5.587e+02, threshold=5.051e+02, percent-clipped=3.0 2023-09-29 08:35:51,559 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.37 vs. limit=15.0 2023-09-29 08:35:53,540 INFO [train.py:1039] (0/4) Epoch 9, batch 3800, loss[loss=0.2268, simple_loss=0.2694, pruned_loss=0.09211, over 20016.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2807, pruned_loss=0.07208, over 4725435.48 frames. ], batch size: 388, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:35:57,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:36:01,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:02,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:36:02,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 08:36:04,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:07,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:08,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:36:10,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-29 08:36:10,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:11,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:36:13,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:14,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:36:14,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:16,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 08:36:19,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 08:36:21,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:36:24,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:27,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:36:27,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:36:30,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 08:36:30,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:34,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:34,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:34,793 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=308773.3333333333, ans=0.1 2023-09-29 08:36:39,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:36:39,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 08:36:41,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:36:47,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:36:53,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:36:55,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 08:36:57,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 08:36:59,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 08:37:00,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:37:02,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:03,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 08:37:07,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 08:37:07,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 08:37:08,182 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.34 vs. limit=15.0 2023-09-29 08:37:09,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:09,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:37:15,127 INFO [train.py:1039] (0/4) Epoch 9, batch 3850, loss[loss=0.2131, simple_loss=0.2817, pruned_loss=0.07229, over 24451.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2794, pruned_loss=0.07165, over 4720650.98 frames. ], batch size: 58, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:37:15,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:37:15,980 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.78 vs. 
limit=22.5 2023-09-29 08:37:16,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:37:21,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:37:21,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 08:37:25,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:37:25,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:28,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:37:30,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=309040.0, ans=0.2 2023-09-29 08:37:32,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:33,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:37:33,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 08:37:40,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:43,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:37:44,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=309040.0, ans=0.07 2023-09-29 08:37:46,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:37:47,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:37:51,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:51,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:37:51,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:53,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:37:53,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:56,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:37:58,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. 
Duration: 0.77 2023-09-29 08:37:58,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 08:37:59,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:01,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:04,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:06,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:06,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 08:38:08,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 08:38:09,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:11,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 08:38:15,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:38:19,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:20,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 08:38:24,677 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.50 vs. limit=15.0 2023-09-29 08:38:25,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:25,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 08:38:28,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 08:38:30,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:31,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:32,100 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=309240.0, ans=6.0 2023-09-29 08:38:33,498 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 1.974e+02 2.231e+02 2.579e+02 4.458e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 08:38:33,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:38:33,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:38:35,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:38:35,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:38:35,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 08:38:38,206 INFO [train.py:1039] (0/4) Epoch 9, batch 3900, loss[loss=0.1985, simple_loss=0.2628, pruned_loss=0.06712, over 23561.00 frames. ], tot_loss[loss=0.2096, simple_loss=0.2776, pruned_loss=0.07079, over 4712906.30 frames. ], batch size: 285, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:38:38,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:38,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 08:38:38,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:38,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:40,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:38:40,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:38:41,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=309306.6666666667, ans=0.125 2023-09-29 08:38:42,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:38:44,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:44,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:44,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:38:44,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 08:38:45,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:48,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:50,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:51,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:38:51,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:56,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 08:38:56,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:57,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:38:59,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 08:38:59,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:01,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 08:39:02,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:39:02,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 08:39:05,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 08:39:08,515 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=22.5 2023-09-29 08:39:09,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:10,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:39:10,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 08:39:12,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:17,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:19,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:39:19,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=309440.0, ans=0.125 2023-09-29 08:39:22,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:39:22,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:24,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:39:30,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:30,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:39:32,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=309506.6666666667, ans=0.125 2023-09-29 08:39:36,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 08:39:38,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:39:43,848 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=309573.3333333333, ans=0.2 2023-09-29 08:39:43,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=309573.3333333333, ans=0.125 2023-09-29 08:39:48,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=309573.3333333333, ans=0.125 2023-09-29 08:39:50,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:39:52,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:52,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 08:39:52,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 08:39:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:55,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 08:39:57,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:59,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. 
Duration: 0.99 2023-09-29 08:40:02,375 INFO [train.py:1039] (0/4) Epoch 9, batch 3950, loss[loss=0.1989, simple_loss=0.2787, pruned_loss=0.0596, over 24459.00 frames. ], tot_loss[loss=0.2092, simple_loss=0.2771, pruned_loss=0.07068, over 4713371.11 frames. ], batch size: 66, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:40:05,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:40:07,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 08:40:07,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:40:10,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:40:10,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:40:15,762 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 08:40:17,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:17,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 08:40:18,763 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 08:40:18,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 08:40:22,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:23,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:40:23,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:26,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 08:40:27,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=309706.6666666667, ans=0.0 2023-09-29 08:40:30,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:40:31,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:31,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:40:31,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:40:34,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:40:46,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:40:46,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 08:40:47,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=309773.3333333333, ans=10.0 2023-09-29 08:40:48,583 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.37 vs. limit=12.0 2023-09-29 08:40:52,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 08:40:53,119 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.75 vs. limit=22.5 2023-09-29 08:40:56,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=309840.0, ans=0.125 2023-09-29 08:40:58,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=309840.0, ans=0.1 2023-09-29 08:40:59,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 08:40:59,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 08:40:59,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:01,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:41:11,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 08:41:11,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:41:11,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:11,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:41:12,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=309906.6666666667, ans=0.0 2023-09-29 08:41:13,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 08:41:16,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:41:17,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:41:18,488 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.18 vs. limit=15.0 2023-09-29 08:41:19,317 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.102e+02 2.264e+02 2.656e+02 4.963e+02, threshold=4.527e+02, percent-clipped=1.0 2023-09-29 08:41:22,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 08:41:23,814 INFO [train.py:1039] (0/4) Epoch 9, batch 4000, loss[loss=0.2205, simple_loss=0.2785, pruned_loss=0.08123, over 23648.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.2779, pruned_loss=0.07058, over 4716034.93 frames. 
], batch size: 256, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:41:25,932 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=309973.3333333333, ans=0.125 2023-09-29 08:41:32,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:41,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:46,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:47,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:41:47,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:47,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 08:41:49,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:41:49,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 08:41:49,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:41:49,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-09-29 08:41:51,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=310040.0, ans=0.125 2023-09-29 08:41:52,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:55,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:41:55,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:41:55,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:57,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:57,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:41:59,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:41:59,376 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 08:41:59,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:42:00,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:04,710 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-09-29 08:42:04,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:42:04,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:13,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 08:42:13,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:42:15,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:42:16,746 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 08:42:18,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:42:18,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=310173.3333333333, ans=0.125 2023-09-29 08:42:19,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 08:42:19,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:42:19,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:22,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 08:42:23,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:42:24,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:42:24,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:25,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 08:42:26,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:28,139 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 08:42:32,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:42:37,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:42:39,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:42:40,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:40,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:42:42,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:42:44,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=310240.0, ans=0.2 2023-09-29 08:42:47,931 INFO [train.py:1039] (0/4) Epoch 9, batch 4050, loss[loss=0.1958, simple_loss=0.2803, pruned_loss=0.05565, over 24644.00 frames. ], tot_loss[loss=0.2088, simple_loss=0.2776, pruned_loss=0.06998, over 4727157.86 frames. ], batch size: 68, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:42:48,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:49,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:42:51,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 08:42:52,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:42:52,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:42:54,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:42:54,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:42:55,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:43:01,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 08:43:01,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=310306.6666666667, ans=0.125 2023-09-29 08:43:01,545 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.25 vs. limit=6.0 2023-09-29 08:43:04,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:05,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:43:07,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:43:08,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:43:11,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:15,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:43:17,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 08:43:19,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 08:43:19,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=310440.0, ans=0.125 2023-09-29 08:43:21,188 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. 
Duration: 0.64 2023-09-29 08:43:22,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:43:25,100 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.63 vs. limit=22.5 2023-09-29 08:43:29,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 08:43:30,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:43:34,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:37,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:38,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:43:40,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:40,966 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.20 vs. limit=6.0 2023-09-29 08:43:41,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:46,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. 
Duration: 0.56 2023-09-29 08:43:46,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:43:46,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=310506.6666666667, ans=0.125 2023-09-29 08:43:47,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:43:50,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 08:43:56,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:44:02,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 08:44:03,052 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:44:04,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:04,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:44:05,847 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.021e+02 2.194e+02 2.667e+02 4.003e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-29 08:44:06,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 08:44:06,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. 
Duration: 0.97 2023-09-29 08:44:06,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:09,642 INFO [train.py:1039] (0/4) Epoch 9, batch 4100, loss[loss=0.2125, simple_loss=0.2972, pruned_loss=0.06397, over 24647.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2791, pruned_loss=0.07088, over 4725954.81 frames. ], batch size: 73, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:44:09,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:09,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:09,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:44:13,584 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.13 vs. limit=12.0 2023-09-29 08:44:14,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=310640.0, ans=0.125 2023-09-29 08:44:17,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 08:44:19,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 08:44:22,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 08:44:23,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-09-29 08:44:23,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:25,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:44:26,745 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 08:44:27,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=310706.6666666667, ans=0.0 2023-09-29 08:44:30,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:31,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:44:31,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:33,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:44:35,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:44:36,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:44:36,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:44:36,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 08:44:38,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:38,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:44:38,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:38,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:44:38,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 08:44:38,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=310706.6666666667, ans=0.125 2023-09-29 08:44:41,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:44:45,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 08:44:46,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:49,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 08:44:49,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 08:44:53,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:53,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:44:53,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:44:56,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 08:44:57,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:44:57,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:45:00,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 08:45:01,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:01,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:03,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=310840.0, ans=0.0 2023-09-29 08:45:04,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 08:45:07,524 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.43 vs. limit=22.5 2023-09-29 08:45:09,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:14,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:14,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:45:20,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=310906.6666666667, ans=0.125 2023-09-29 08:45:22,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:22,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:28,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:31,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:45:33,037 INFO [train.py:1039] (0/4) Epoch 9, batch 4150, loss[loss=0.2165, simple_loss=0.2891, pruned_loss=0.07197, over 23380.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2789, pruned_loss=0.07086, over 4725956.41 frames. ], batch size: 93, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:45:33,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 08:45:33,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=310973.3333333333, ans=0.015 2023-09-29 08:45:34,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:45:34,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:45:34,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:38,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 08:45:38,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:39,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 08:45:40,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 08:45:41,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 08:45:41,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:47,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:45:47,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 08:45:48,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=311040.0, ans=0.1 2023-09-29 08:45:52,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:53,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:45:54,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:45:57,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:45:57,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:58,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:46:01,120 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.05 vs. limit=6.0 2023-09-29 08:46:04,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:05,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=311106.6666666667, ans=0.125 2023-09-29 08:46:08,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 08:46:10,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 08:46:12,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 08:46:12,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:46:13,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 08:46:13,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:46:13,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:17,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:17,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:21,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 08:46:25,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=311173.3333333333, ans=0.0 2023-09-29 08:46:26,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:26,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 08:46:28,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 08:46:28,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:30,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 08:46:31,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:46:34,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:36,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:38,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 08:46:38,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:46:38,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:46:39,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:46:41,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. 
Duration: 0.98 2023-09-29 08:46:41,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=311240.0, ans=0.0 2023-09-29 08:46:43,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:43,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:46:43,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:46:43,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 08:46:43,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:45,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:46:46,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:46,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:46,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 08:46:48,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 08:46:51,307 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.017e+02 2.325e+02 2.759e+02 4.576e+02, threshold=4.650e+02, percent-clipped=1.0 2023-09-29 08:46:54,452 INFO [train.py:1039] (0/4) Epoch 9, batch 4200, loss[loss=0.2059, simple_loss=0.2593, pruned_loss=0.07623, over 23588.00 frames. ], tot_loss[loss=0.21, simple_loss=0.2777, pruned_loss=0.07113, over 4717939.09 frames. ], batch size: 256, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:46:54,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:46:56,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 08:46:57,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:46:59,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:47:01,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:47:02,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:02,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:03,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=311306.6666666667, ans=0.1 2023-09-29 08:47:05,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-09-29 08:47:09,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 08:47:10,087 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.14 vs. limit=22.5 2023-09-29 08:47:10,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:12,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:16,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:47:21,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:47:21,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:21,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:22,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 08:47:22,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:24,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:24,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 08:47:24,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:47:26,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:47:27,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 08:47:28,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:32,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:47:32,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:47:36,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:47:37,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:47:38,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=311440.0, ans=0.0 2023-09-29 08:47:40,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:47:40,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 08:47:40,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:47:42,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:47:46,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:47:50,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:53,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:47:58,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 08:48:00,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:04,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:48:06,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:07,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 08:48:14,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:48:18,099 INFO [train.py:1039] (0/4) Epoch 9, batch 4250, loss[loss=0.2174, simple_loss=0.2975, pruned_loss=0.0687, over 24584.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2767, pruned_loss=0.07078, over 4722388.99 frames. 
], batch size: 71, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:48:19,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:48:19,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:48:22,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:26,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=311640.0, ans=0.0 2023-09-29 08:48:28,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:48:28,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 08:48:28,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:48:33,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:36,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:48:41,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:41,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:48:44,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:48:44,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:48:45,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:46,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:48,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:49,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:48:51,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:54,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 08:48:57,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 08:48:57,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:59,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 08:48:59,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=311773.3333333333, ans=0.1 2023-09-29 08:48:59,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=311773.3333333333, ans=0.1 2023-09-29 08:49:00,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:49:00,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:49:00,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:00,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:49:05,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:49:05,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:49:09,214 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.06 vs. limit=22.5 2023-09-29 08:49:11,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:12,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:49:13,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 08:49:13,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:49:14,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 08:49:16,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:49:19,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:49:20,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:20,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:49:22,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 08:49:24,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:49:24,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:49:28,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:32,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:49:33,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:49:33,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:35,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:36,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:49:38,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:49:38,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 08:49:39,724 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.120e+02 2.435e+02 2.958e+02 4.592e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-29 08:49:40,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:41,324 INFO [train.py:1039] (0/4) Epoch 9, batch 4300, loss[loss=0.2179, simple_loss=0.2742, pruned_loss=0.08076, over 23696.00 frames. ], tot_loss[loss=0.2088, simple_loss=0.2766, pruned_loss=0.07054, over 4731323.03 frames. ], batch size: 232, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:49:46,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:46,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 08:49:51,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:59,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:59,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 08:50:01,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:50:05,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:50:05,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:50:05,407 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 08:50:08,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:50:09,111 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.62 vs. limit=15.0 2023-09-29 08:50:10,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:13,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 08:50:14,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 08:50:14,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 08:50:16,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:50:17,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:50:20,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:50:20,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:50:22,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:50:24,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:25,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:50:25,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 08:50:25,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 08:50:28,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:50:30,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:50:30,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:50:30,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:31,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 08:50:31,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 08:50:32,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 08:50:34,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:50:36,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 08:50:36,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 08:50:40,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:42,274 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 08:50:43,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 08:50:46,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:46,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:48,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 08:50:48,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=312240.0, ans=0.125 2023-09-29 08:50:49,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:49,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:49,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:50:49,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:50:51,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:50:54,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:50:57,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 08:50:57,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:58,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:51:02,102 INFO [train.py:1039] (0/4) Epoch 9, batch 4350, loss[loss=0.1999, simple_loss=0.2649, pruned_loss=0.06743, over 23700.00 frames. ], tot_loss[loss=0.2098, simple_loss=0.2783, pruned_loss=0.07061, over 4743109.89 frames. ], batch size: 149, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:51:02,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=312306.6666666667, ans=0.0 2023-09-29 08:51:03,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 08:51:03,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:51:04,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=312306.6666666667, ans=0.125 2023-09-29 08:51:09,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:12,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:15,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:51:15,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:51:20,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:51:25,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:26,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:51:26,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:51:29,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:51:31,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:51:31,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=312373.3333333333, ans=0.125 2023-09-29 08:51:33,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:51:38,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 08:51:39,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:41,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:46,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:51:49,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 08:51:51,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:51:54,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:51:57,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=312506.6666666667, ans=0.2 2023-09-29 08:51:57,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=312506.6666666667, ans=0.0 2023-09-29 08:51:59,107 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 08:52:00,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:00,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:52:00,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=312506.6666666667, ans=0.125 2023-09-29 08:52:02,269 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 08:52:02,375 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 08:52:02,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 08:52:03,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:03,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:52:05,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:07,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:07,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:08,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 08:52:08,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:08,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:10,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:10,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 08:52:11,771 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 08:52:11,778 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-09-29 08:52:11,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 08:52:17,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:52:17,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:52:17,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:19,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:52:20,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 08:52:22,648 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.989e+02 2.158e+02 2.516e+02 4.089e+02, threshold=4.315e+02, percent-clipped=0.0 2023-09-29 08:52:22,922 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 08:52:22,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:24,247 INFO [train.py:1039] (0/4) Epoch 9, batch 4400, loss[loss=0.2416, simple_loss=0.3015, pruned_loss=0.09087, over 23750.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2795, pruned_loss=0.07075, over 4751240.81 frames. ], batch size: 164, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:52:25,430 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.54 vs. 
limit=6.0 2023-09-29 08:52:29,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:29,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:30,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:33,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 08:52:33,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 08:52:33,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 08:52:33,823 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 08:52:35,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:52:35,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:38,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 08:52:40,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=312706.6666666667, ans=0.125 2023-09-29 08:52:41,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:52:42,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:44,183 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 08:52:44,499 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=312706.6666666667, ans=0.09899494936611666 2023-09-29 08:52:47,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:47,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 08:52:48,006 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 08:52:51,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 08:52:53,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 08:52:53,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 08:52:53,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:53,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:53,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 08:52:54,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:57,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 08:52:57,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 08:52:57,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:57,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=312773.3333333333, ans=0.1 2023-09-29 08:53:02,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:53:02,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:02,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:03,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:03,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 08:53:05,433 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 08:53:08,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:53:15,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:53:17,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 08:53:21,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:53:23,792 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=312840.0, ans=0.125 2023-09-29 08:53:25,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:27,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:53:29,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 08:53:29,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:53:30,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:53:30,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:53:30,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:53:33,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-09-29 08:53:37,078 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=312906.6666666667, ans=0.125 2023-09-29 08:53:38,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 08:53:39,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 08:53:39,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:39,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 08:53:40,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:53:43,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:53:45,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 08:53:46,751 INFO [train.py:1039] (0/4) Epoch 9, batch 4450, loss[loss=0.2419, simple_loss=0.2942, pruned_loss=0.09478, over 23763.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2796, pruned_loss=0.07075, over 4746058.11 frames. ], batch size: 164, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:53:50,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 08:53:50,369 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=312973.3333333333, ans=0.125 2023-09-29 08:53:53,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:53,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:53:59,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:53:59,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:54:06,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:08,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:54:11,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:54:11,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:13,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 08:54:13,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 08:54:13,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:14,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:14,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:54:16,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:54:21,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:22,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:25,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:25,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:25,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:54:30,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:54:32,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 08:54:32,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. 
Duration: 0.86 2023-09-29 08:54:32,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:54:35,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:39,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 08:54:42,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:54:45,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:47,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 08:54:47,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:47,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:47,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:54:47,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:49,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:52,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 08:54:52,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 08:54:55,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:54:55,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:56,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:58,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:58,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:55:00,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:55:03,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 08:55:04,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:55:06,704 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.016e+02 2.496e+02 2.860e+02 5.111e+02, threshold=4.992e+02, percent-clipped=1.0 2023-09-29 08:55:07,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=313306.6666666667, ans=0.2 2023-09-29 08:55:08,150 INFO [train.py:1039] (0/4) Epoch 9, batch 4500, loss[loss=0.2922, simple_loss=0.3292, pruned_loss=0.1276, over 19796.00 frames. 
], tot_loss[loss=0.2118, simple_loss=0.2804, pruned_loss=0.07159, over 4744191.15 frames. ], batch size: 388, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:55:09,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:12,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 08:55:12,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 08:55:12,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=313306.6666666667, ans=0.0 2023-09-29 08:55:14,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:20,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:55:20,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:20,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:55:22,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:55:22,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 08:55:22,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=313306.6666666667, ans=0.2 2023-09-29 08:55:23,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:34,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:36,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:55:38,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:55:40,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:55:41,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:55:48,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=313440.0, ans=0.035 2023-09-29 08:55:49,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:55:52,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=313440.0, ans=0.0 2023-09-29 08:55:53,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:55:59,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 08:56:02,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:56:02,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 08:56:02,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:04,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:56:08,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:56:09,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 08:56:09,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:56:09,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:16,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:56:16,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 08:56:17,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:20,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:56:20,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:56:22,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 08:56:25,079 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.19 vs. limit=15.0 2023-09-29 08:56:25,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 08:56:25,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 08:56:30,944 INFO [train.py:1039] (0/4) Epoch 9, batch 4550, loss[loss=0.2079, simple_loss=0.2777, pruned_loss=0.06901, over 23314.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2798, pruned_loss=0.07162, over 4738036.52 frames. ], batch size: 93, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:56:31,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 08:56:34,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 08:56:35,051 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.18 vs. 
limit=15.0 2023-09-29 08:56:35,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:37,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:38,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:41,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:45,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:56:47,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:48,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:56:50,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:56:50,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:52,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:52,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:59,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 08:57:00,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 08:57:00,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 08:57:00,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=313706.6666666667, ans=0.1 2023-09-29 08:57:02,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:57:04,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 08:57:06,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 08:57:06,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:10,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 08:57:12,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:57:14,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 08:57:19,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 08:57:22,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:24,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:25,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:26,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:27,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 08:57:29,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 08:57:29,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:57:31,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 08:57:34,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 08:57:34,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:36,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 08:57:36,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:57:37,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:37,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:57:39,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:57:39,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 08:57:42,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:42,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 08:57:44,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 08:57:44,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:57:44,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 08:57:47,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:57:47,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 08:57:51,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:57:52,526 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.108e+02 2.512e+02 3.014e+02 4.343e+02, threshold=5.024e+02, percent-clipped=0.0 2023-09-29 08:57:52,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:52,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:57:54,158 INFO [train.py:1039] (0/4) Epoch 9, batch 4600, loss[loss=0.2153, simple_loss=0.289, pruned_loss=0.07079, over 24386.00 frames. ], tot_loss[loss=0.2099, simple_loss=0.278, pruned_loss=0.07084, over 4726892.96 frames. ], batch size: 77, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:57:54,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:57:55,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:57:57,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:59,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:58:01,379 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=313973.3333333333, ans=0.125 2023-09-29 08:58:02,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 08:58:02,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:58:04,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:06,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 08:58:07,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:58:11,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:58:11,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:16,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:22,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 08:58:24,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:26,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:29,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:58:29,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 08:58:35,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 08:58:35,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:58:37,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:58:41,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:42,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:58:44,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:58:47,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 08:58:49,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:58:50,595 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.01 vs. limit=15.0 2023-09-29 08:58:54,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:55,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:58:57,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 08:58:57,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:58:57,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:57,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 08:58:59,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:59,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:01,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:02,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:59:04,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:05,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 08:59:05,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 08:59:05,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 08:59:07,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 08:59:09,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:11,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:11,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:17,963 INFO [train.py:1039] (0/4) Epoch 9, batch 4650, loss[loss=0.1975, simple_loss=0.2703, pruned_loss=0.06234, over 24481.00 frames. ], tot_loss[loss=0.2098, simple_loss=0.2781, pruned_loss=0.07073, over 4729862.58 frames. ], batch size: 63, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:59:20,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=314306.6666666667, ans=0.0 2023-09-29 08:59:21,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:59:26,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:26,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:26,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:59:26,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:27,147 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.79 vs. 
limit=15.0 2023-09-29 08:59:27,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:29,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:29,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=314306.6666666667, ans=0.0 2023-09-29 08:59:31,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 08:59:32,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=314373.3333333333, ans=10.0 2023-09-29 08:59:34,449 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=314373.3333333333, ans=0.125 2023-09-29 08:59:36,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:59:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 08:59:37,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:39,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 08:59:39,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:59:39,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. 
Duration: 0.99 2023-09-29 08:59:40,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 08:59:40,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:40,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:59:44,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:59:46,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:46,709 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 08:59:49,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:51,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 08:59:54,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:54,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:59:55,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 08:59:57,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:00:02,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:00:05,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:10,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:13,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:00:17,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 09:00:19,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 09:00:19,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 09:00:19,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 09:00:20,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:00:22,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=314573.3333333333, ans=10.0 2023-09-29 09:00:27,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:00:27,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:27,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 09:00:27,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:30,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:30,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:00:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:00:34,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:00:34,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:35,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:00:38,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:38,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:00:38,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:00:40,262 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 2.149e+02 2.483e+02 2.991e+02 4.624e+02, threshold=4.965e+02, percent-clipped=0.0 2023-09-29 09:00:40,307 INFO [train.py:1039] (0/4) Epoch 9, batch 4700, loss[loss=0.1766, simple_loss=0.2483, pruned_loss=0.0525, over 24470.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2784, pruned_loss=0.07094, over 4734906.85 frames. ], batch size: 58, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:00:40,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:00:42,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:00:42,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 09:00:50,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:52,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:52,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:00:53,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:57,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:01:00,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=314706.6666666667, ans=0.2 2023-09-29 09:01:01,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 09:01:01,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=314706.6666666667, ans=0.0 2023-09-29 09:01:03,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 09:01:05,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:06,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:01:08,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:01:11,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:14,754 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:01:16,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 09:01:17,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 09:01:21,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:01:26,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 09:01:28,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:01:30,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:34,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 09:01:35,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:01:39,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=314840.0, ans=0.125 2023-09-29 09:01:41,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:01:41,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 09:01:42,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:42,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:01:46,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:46,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:01:47,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 09:01:48,049 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 09:01:48,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=314906.6666666667, ans=0.125 2023-09-29 09:01:49,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:53,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 09:01:55,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:58,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 09:02:01,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 09:02:03,561 INFO [train.py:1039] (0/4) Epoch 9, batch 4750, loss[loss=0.1705, simple_loss=0.2495, pruned_loss=0.04577, over 24332.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2791, pruned_loss=0.07066, over 4738957.39 frames. ], batch size: 56, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:02:03,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:08,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:09,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:02:11,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 09:02:11,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:11,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=314973.3333333333, ans=0.125 2023-09-29 09:02:14,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 09:02:16,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:02:16,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:02:17,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:02:24,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 09:02:28,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:02:28,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=315040.0, ans=0.125 2023-09-29 09:02:30,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 09:02:31,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:35,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=315106.6666666667, ans=0.0 2023-09-29 09:02:36,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:38,212 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 09:02:38,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 09:02:43,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. 
Duration: 0.82 2023-09-29 09:02:46,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:48,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:02:50,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:02:50,182 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 09:02:50,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:02:54,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:02:58,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:03:00,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 09:03:01,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 09:03:01,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:01,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:03:03,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:03:03,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:03:05,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 09:03:07,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 09:03:10,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:11,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:03:11,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 09:03:11,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:13,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:15,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:03:15,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=315240.0, ans=0.2 2023-09-29 09:03:17,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:17,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 09:03:20,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:20,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 09:03:21,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 09:03:23,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 09:03:26,102 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.000e+02 2.225e+02 2.502e+02 3.899e+02, threshold=4.449e+02, percent-clipped=0.0 2023-09-29 09:03:26,146 INFO [train.py:1039] (0/4) Epoch 9, batch 4800, loss[loss=0.1826, simple_loss=0.2541, pruned_loss=0.05554, over 24346.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2803, pruned_loss=0.07191, over 4733532.74 frames. ], batch size: 56, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:03:26,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:03:26,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:27,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 09:03:35,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:36,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 09:03:42,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:03:43,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:43,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:43,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=315373.3333333333, ans=0.0 2023-09-29 09:03:45,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 09:03:45,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:45,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:03:48,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:03:49,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=315373.3333333333, ans=0.125 2023-09-29 09:03:52,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=315373.3333333333, ans=0.95 2023-09-29 09:03:53,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:03:54,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=315373.3333333333, ans=0.125 2023-09-29 09:03:56,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:56,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:03:57,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:57,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:03:57,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:59,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:03,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:04,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=315440.0, ans=0.125 2023-09-29 09:04:05,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 09:04:06,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:04:08,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:04:10,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:10,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 09:04:12,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 09:04:12,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:12,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:04:13,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:04:13,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:13,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:04:15,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:04:17,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:04:19,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:24,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:25,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:29,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 09:04:29,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:30,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:04:31,389 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.76 vs. limit=6.0 2023-09-29 09:04:32,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:35,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:36,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 09:04:36,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:36,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:04:36,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:04:37,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=315573.3333333333, ans=0.125 2023-09-29 09:04:37,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.84 vs. limit=15.0 2023-09-29 09:04:38,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:04:41,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=315573.3333333333, ans=0.1 2023-09-29 09:04:42,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:42,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:42,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:44,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. 
Duration: 0.94 2023-09-29 09:04:47,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 09:04:47,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:47,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:49,618 INFO [train.py:1039] (0/4) Epoch 9, batch 4850, loss[loss=0.2114, simple_loss=0.2776, pruned_loss=0.07258, over 18767.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2813, pruned_loss=0.07235, over 4715549.57 frames. ], batch size: 41, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:04:49,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:04:49,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:52,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:05:00,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 09:05:03,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:05,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=315706.6666666667, ans=0.0 2023-09-29 09:05:08,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 09:05:08,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:05:08,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:13,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:13,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:05:14,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:05:14,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 09:05:20,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:05:22,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:05:22,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:05:22,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:05:22,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-09-29 09:05:22,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=315773.3333333333, ans=0.2 2023-09-29 09:05:25,353 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.78 vs. limit=10.0 2023-09-29 09:05:26,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:26,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 09:05:32,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 09:05:32,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:05:38,215 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.46 vs. limit=10.0 2023-09-29 09:05:40,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:05:41,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-09-29 09:05:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:05:42,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:05:42,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=315840.0, ans=0.125 2023-09-29 09:05:43,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:05:45,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 09:05:45,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:45,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=315840.0, ans=0.125 2023-09-29 09:05:46,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 09:05:48,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:50,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:05:50,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 09:05:59,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:06:04,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:06:04,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:09,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 09:06:09,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:06:09,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=315906.6666666667, ans=0.125 2023-09-29 09:06:12,309 INFO [train.py:1039] (0/4) Epoch 9, batch 4900, loss[loss=0.2346, simple_loss=0.291, pruned_loss=0.08908, over 23853.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2807, pruned_loss=0.07205, over 4724719.47 frames. ], batch size: 195, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:06:12,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=315973.3333333333, ans=0.1 2023-09-29 09:06:13,851 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.067e+02 2.446e+02 3.189e+02 7.103e+02, threshold=4.893e+02, percent-clipped=2.0 2023-09-29 09:06:14,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:15,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:15,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 09:06:19,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 09:06:24,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 09:06:28,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 09:06:31,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 09:06:31,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:33,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:33,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:06:33,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:33,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:06:34,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 09:06:36,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 09:06:37,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 09:06:39,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:06:41,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:44,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:06:44,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:46,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:06:46,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 09:06:47,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:06:49,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:50,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 09:06:50,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-09-29 09:06:51,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=316106.6666666667, ans=0.125 2023-09-29 09:06:52,917 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=32.58 vs. limit=15.0 2023-09-29 09:06:54,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 09:06:54,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:06:56,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:06:57,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:06:57,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:57,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:06:57,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:06:59,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 09:07:04,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:07:06,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:07:06,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=316173.3333333333, ans=0.125 2023-09-29 09:07:07,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:07:10,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 09:07:12,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:07:13,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:07:14,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 09:07:16,689 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=15.06 vs. limit=15.0 2023-09-29 09:07:19,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:19,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=316240.0, ans=0.125 2023-09-29 09:07:22,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 09:07:22,555 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=316240.0, ans=0.1 2023-09-29 09:07:23,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 09:07:23,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:23,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:07:25,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:30,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:30,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:07:30,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:30,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 09:07:32,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:07:34,563 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.43 vs. 
limit=15.0 2023-09-29 09:07:35,218 INFO [train.py:1039] (0/4) Epoch 9, batch 4950, loss[loss=0.2045, simple_loss=0.2578, pruned_loss=0.07564, over 22854.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2797, pruned_loss=0.07144, over 4735742.27 frames. ], batch size: 322, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:07:37,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:37,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:39,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 09:07:41,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 09:07:41,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:07:42,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 09:07:42,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:42,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:07:44,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:07:44,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:07:47,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:49,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:07:49,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:07:50,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:50,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:52,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:55,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:07:58,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=316373.3333333333, ans=0.04949747468305833 2023-09-29 09:08:00,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:01,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:08:03,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:08:05,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:07,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:08:07,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 09:08:08,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 09:08:11,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:13,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:08:13,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:08:16,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:08:16,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:08:18,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:08:20,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:25,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 09:08:26,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:08:28,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:28,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:28,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=316506.6666666667, ans=0.125 2023-09-29 09:08:30,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 09:08:30,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:08:31,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:08:34,091 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-09-29 09:08:34,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:08:35,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:08:35,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 09:08:36,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:38,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:08:40,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:08:41,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:08:41,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:08:43,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:43,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 09:08:49,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:08:53,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 09:08:53,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 09:08:54,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=316573.3333333333, ans=0.125 2023-09-29 09:08:57,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=316640.0, ans=0.0 2023-09-29 09:08:59,021 INFO [train.py:1039] (0/4) Epoch 9, batch 5000, loss[loss=0.2004, simple_loss=0.2799, pruned_loss=0.06042, over 24474.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2782, pruned_loss=0.0706, over 4738764.14 frames. ], batch size: 66, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:09:00,623 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.030e+02 2.416e+02 2.948e+02 4.844e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 09:09:00,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:00,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:02,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 09:09:03,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 09:09:05,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:09:07,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 09:09:07,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 09:09:07,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:09:08,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 09:09:08,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:08,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:09,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=316640.0, ans=0.0 2023-09-29 09:09:10,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 09:09:10,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:10,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:13,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 09:09:13,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 09:09:15,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:09:15,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. 
Duration: 0.97 2023-09-29 09:09:15,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:09:16,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:16,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:09:16,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 09:09:16,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 09:09:20,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 09:09:20,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:20,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:22,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 09:09:22,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:23,151 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=316706.6666666667, ans=0.125 2023-09-29 09:09:24,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:09:25,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:28,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:09:28,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 09:09:30,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:09:31,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:09:37,793 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 09:09:39,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:41,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:41,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:09:41,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=316773.3333333333, ans=0.2 2023-09-29 09:09:47,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 09:09:47,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:09:47,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:49,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:50,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 09:09:50,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:54,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:56,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:00,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=316840.0, ans=0.0 2023-09-29 09:10:02,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 09:10:07,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:10,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=316906.6666666667, ans=0.1 2023-09-29 09:10:10,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=316906.6666666667, ans=0.0 2023-09-29 09:10:16,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 09:10:19,320 INFO [train.py:1039] (0/4) Epoch 9, batch 5050, loss[loss=0.1806, simple_loss=0.2534, pruned_loss=0.05391, over 24579.00 frames. ], tot_loss[loss=0.2098, simple_loss=0.2786, pruned_loss=0.07052, over 4750549.04 frames. ], batch size: 60, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:10:19,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:19,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:10:19,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:19,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:10:19,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:10:19,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:24,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:24,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 09:10:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:10:30,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:10:31,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:10:32,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 09:10:33,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:34,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:10:38,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:10:39,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:10:41,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:10:49,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 09:10:50,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:10:50,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:10:52,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 09:10:52,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 09:10:53,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:53,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:55,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:10:55,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 09:10:55,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 09:10:56,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:59,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:02,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:11:02,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 09:11:03,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 09:11:04,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=317106.6666666667, ans=0.125 2023-09-29 09:11:05,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=317106.6666666667, ans=0.125 2023-09-29 09:11:06,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 09:11:09,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:11:09,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:11:11,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:12,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:11:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:11:17,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:11:19,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:19,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:11:19,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:11:20,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 09:11:20,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:11:22,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=317173.3333333333, ans=0.125 2023-09-29 09:11:23,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:11:27,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:27,176 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 09:11:27,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:11:28,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:11:29,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=317240.0, ans=0.125 2023-09-29 09:11:30,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:31,671 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-09-29 09:11:32,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=317240.0, ans=0.125 2023-09-29 09:11:35,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:35,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 09:11:35,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:38,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=317240.0, ans=0.125 2023-09-29 09:11:40,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:41,517 INFO [train.py:1039] (0/4) Epoch 9, batch 5100, loss[loss=0.218, simple_loss=0.2727, pruned_loss=0.08172, over 23650.00 frames. ], tot_loss[loss=0.2099, simple_loss=0.279, pruned_loss=0.07039, over 4747861.30 frames. ], batch size: 232, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:11:41,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:41,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 09:11:43,051 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.032e+02 2.319e+02 2.659e+02 3.992e+02, threshold=4.638e+02, percent-clipped=0.0 2023-09-29 09:11:43,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. 
Duration: 0.54 2023-09-29 09:11:45,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:45,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:11:47,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:11:49,306 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 09:11:51,049 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:11:51,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=317306.6666666667, ans=0.0 2023-09-29 09:11:52,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:55,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 09:11:57,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 09:11:57,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:59,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:12:02,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:12:02,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 09:12:02,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 09:12:04,706 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.62 vs. limit=15.0 2023-09-29 09:12:06,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:12:08,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:12:12,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:12:15,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 09:12:15,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:16,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:12:16,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 09:12:19,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:12:22,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 09:12:24,067 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 09:12:24,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:24,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 09:12:24,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 09:12:28,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:37,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=317506.6666666667, ans=0.025 2023-09-29 09:12:38,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:12:41,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 09:12:43,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 09:12:43,116 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 09:12:45,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. 
Duration: 0.81 2023-09-29 09:12:45,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:47,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 09:12:51,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 09:12:54,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:12:56,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:12:56,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=317573.3333333333, ans=0.0 2023-09-29 09:12:59,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 09:13:00,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:13:00,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 09:13:05,221 INFO [train.py:1039] (0/4) Epoch 9, batch 5150, loss[loss=0.2112, simple_loss=0.2858, pruned_loss=0.06835, over 24687.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2785, pruned_loss=0.06965, over 4758648.23 frames. ], batch size: 73, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:13:07,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 09:13:07,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:13:07,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:13:08,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:13:08,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:13:10,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:13:10,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 09:13:10,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 09:13:12,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 09:13:13,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:13:13,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 09:13:13,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:15,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 09:13:17,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:19,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:23,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:13:23,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 09:13:25,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:26,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:13:28,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:13:28,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:28,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:28,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:13:28,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:13:31,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. 
Duration: 0.84 2023-09-29 09:13:32,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:13:32,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:13:34,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:13:37,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 09:13:37,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:13:43,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:13:45,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 09:13:47,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=317773.3333333333, ans=0.2 2023-09-29 09:13:47,522 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:13:48,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:49,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=317773.3333333333, ans=0.125 2023-09-29 09:13:54,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:13:55,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:57,774 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:14:01,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:01,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:01,469 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=317840.0, ans=0.125 2023-09-29 09:14:04,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 09:14:07,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:14:08,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:14:09,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:14:09,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=317840.0, ans=0.125 2023-09-29 09:14:13,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:15,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 09:14:16,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 09:14:16,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=317906.6666666667, ans=0.2 2023-09-29 09:14:20,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:14:23,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:14:25,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:14:25,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:14:25,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:14:25,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:14:26,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:14:26,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:14:28,340 INFO [train.py:1039] (0/4) Epoch 9, batch 5200, loss[loss=0.2077, simple_loss=0.2845, pruned_loss=0.06546, over 24499.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2796, pruned_loss=0.07059, over 4748772.74 frames. 
], batch size: 66, lr: 1.12e-02, grad_scale: 16.0 2023-09-29 09:14:29,845 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.093e+02 2.397e+02 2.811e+02 4.237e+02, threshold=4.795e+02, percent-clipped=0.0 2023-09-29 09:14:31,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:14:32,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:14:34,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=317973.3333333333, ans=0.125 2023-09-29 09:14:36,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:40,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 09:14:40,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:14:41,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:44,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:45,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:14:45,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:14:48,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 09:14:49,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:14:51,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:53,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 09:14:55,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:14:55,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:14:56,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 09:14:56,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 09:14:59,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 09:15:01,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:01,186 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 09:15:01,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:15:02,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:02,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:15:02,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=318106.6666666667, ans=0.125 2023-09-29 09:15:04,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 09:15:05,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:08,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:11,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 09:15:11,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 09:15:12,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 09:15:17,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 09:15:17,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:15:25,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 09:15:25,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 09:15:27,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:27,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:15:27,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:29,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:15:30,082 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.73 vs. limit=15.0 2023-09-29 09:15:33,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:33,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:15:38,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:40,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 09:15:40,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:41,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=318240.0, ans=0.125 2023-09-29 09:15:47,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:47,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 09:15:47,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:47,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=318240.0, ans=0.0 2023-09-29 09:15:48,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:15:49,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:49,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:15:50,512 INFO [train.py:1039] (0/4) Epoch 9, batch 5250, loss[loss=0.236, simple_loss=0.3104, pruned_loss=0.08078, over 24107.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2797, pruned_loss=0.07103, over 4742975.55 frames. ], batch size: 80, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:15:50,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 09:15:52,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:55,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:55,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:15:57,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:16:04,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:16:04,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:16:07,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:16:08,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:16:10,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 09:16:11,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:16:12,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:16:19,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=318373.3333333333, ans=0.125 2023-09-29 09:16:24,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=318440.0, ans=0.125 2023-09-29 09:16:45,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=318506.6666666667, ans=0.125 2023-09-29 09:16:48,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=318506.6666666667, ans=0.125 2023-09-29 09:17:05,007 INFO [train.py:1039] (0/4) Epoch 9, batch 5300, loss[loss=0.2246, simple_loss=0.2993, pruned_loss=0.07498, over 23392.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2789, pruned_loss=0.07086, over 4729703.90 frames. ], batch size: 93, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:17:05,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=318640.0, ans=10.0 2023-09-29 09:17:07,710 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.994e+02 2.180e+02 2.520e+02 5.243e+02, threshold=4.360e+02, percent-clipped=1.0 2023-09-29 09:17:20,002 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-9.pt 2023-09-29 09:17:27,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:17:27,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 09:17:27,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. 
Duration: 0.8 2023-09-29 09:17:27,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:27,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:27,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:27,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:27,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:27,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:17:28,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:28,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:17:28,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:17:28,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 09:17:28,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 09:17:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. 
Duration: 0.99 2023-09-29 09:17:29,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:17:29,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 09:17:29,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 09:17:29,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:30,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:30,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:30,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:30,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:17:31,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:31,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:31,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:31,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:17:31,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:31,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:17:31,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:31,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:17:32,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 09:17:32,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:33,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:33,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 09:17:33,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 09:17:33,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:17:33,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:17:33,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. 
Duration: 0.43 2023-09-29 09:17:33,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 09:17:33,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:34,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:17:35,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:35,173 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 09:17:35,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 09:17:35,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:17:35,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:35,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 09:17:35,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 09:17:35,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 09:17:36,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 09:17:39,044 INFO [train.py:1039] (0/4) Epoch 10, batch 0, loss[loss=0.21, simple_loss=0.2855, pruned_loss=0.0672, over 23905.00 frames. ], tot_loss[loss=0.21, simple_loss=0.2855, pruned_loss=0.0672, over 23905.00 frames. ], batch size: 86, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:17:39,045 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 09:17:52,998 INFO [train.py:1071] (0/4) Epoch 10, validation: loss=0.3048, simple_loss=0.281, pruned_loss=0.1643, over 1125622.00 frames. 2023-09-29 09:17:52,999 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 09:17:56,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 09:17:56,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:17:58,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:18:03,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:03,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:18:05,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:06,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 09:18:08,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. 
Duration: 0.94 2023-09-29 09:18:09,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:09,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:13,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:13,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:13,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:18:13,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:14,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 09:18:17,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:20,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=318786.6666666667, ans=0.125 2023-09-29 09:18:22,137 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=318786.6666666667, ans=0.2 2023-09-29 09:18:23,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:18:23,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 09:18:25,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 09:18:30,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:18:30,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:18:33,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:36,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:18:37,877 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.83 vs. limit=15.0 2023-09-29 09:18:41,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=318920.0, ans=0.04949747468305833 2023-09-29 09:18:42,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:47,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 09:18:51,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 09:18:53,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:18:53,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 09:18:53,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:18:53,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:56,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 09:19:00,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:00,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:06,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:08,813 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 09:19:09,438 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.53 vs. limit=10.0 2023-09-29 09:19:10,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:19:12,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:13,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:19:13,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-09-29 09:19:15,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:19:15,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:19:16,486 INFO [train.py:1039] (0/4) Epoch 10, batch 50, loss[loss=0.2153, simple_loss=0.2788, pruned_loss=0.07593, over 23842.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2786, pruned_loss=0.07132, over 1069972.96 frames. ], batch size: 195, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:19:18,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:18,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:22,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:23,480 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=12.0 2023-09-29 09:19:25,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 09:19:25,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:33,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:19:34,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. 
Duration: 0.83 2023-09-29 09:19:35,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=319120.0, ans=0.09899494936611666 2023-09-29 09:19:37,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 09:19:39,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:19:41,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:19:41,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:43,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:19:43,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=319120.0, ans=0.125 2023-09-29 09:19:43,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=319120.0, ans=0.2 2023-09-29 09:19:44,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:19:44,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:19:44,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:19:45,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=319120.0, ans=0.125 2023-09-29 09:19:49,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=319186.6666666667, ans=0.125 2023-09-29 09:19:52,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:19:54,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:54,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:19:54,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=319186.6666666667, ans=0.125 2023-09-29 09:19:55,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 09:19:57,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:19:57,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:19:57,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 09:19:58,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:00,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. 
Duration: 0.97 2023-09-29 09:20:06,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:07,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:20:09,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:10,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:12,872 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=319253.3333333333, ans=0.125 2023-09-29 09:20:14,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 09:20:14,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 09:20:16,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:16,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:17,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:20:18,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:20:18,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 09:20:19,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 09:20:20,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:20:22,295 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.176e+02 2.452e+02 2.821e+02 3.971e+02, threshold=4.904e+02, percent-clipped=0.0 2023-09-29 09:20:22,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:22,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:20:23,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 09:20:24,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 09:20:24,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:25,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:27,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:20:27,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:20:30,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:20:34,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:20:36,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:38,787 INFO [train.py:1039] (0/4) Epoch 10, batch 100, loss[loss=0.2272, simple_loss=0.2893, pruned_loss=0.08252, over 23669.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2792, pruned_loss=0.07056, over 1889057.01 frames. ], batch size: 256, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:20:38,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 09:20:38,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:45,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:20:45,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:45,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:45,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:45,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 09:20:47,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 09:20:49,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:20:49,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:49,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:49,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:55,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 09:20:56,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:57,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:57,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=319453.3333333333, ans=0.95 2023-09-29 09:20:58,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:21:00,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=319453.3333333333, ans=0.0 2023-09-29 09:21:01,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 09:21:04,734 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 09:21:04,771 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 09:21:06,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:06,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:21:09,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:21:12,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:21:14,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:14,613 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=319520.0, ans=0.125 2023-09-29 09:21:21,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:21,319 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 09:21:23,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:21:27,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 09:21:29,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:21:30,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:33,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:36,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:38,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:21:41,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:43,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:43,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:43,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:21:43,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=319653.3333333333, ans=0.0 2023-09-29 09:21:45,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:45,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. 
Duration: 0.91 2023-09-29 09:21:45,614 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 09:21:47,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:48,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:21:48,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:48,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:48,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 09:21:48,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:21:50,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:21:50,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:51,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:52,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:52,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 09:21:53,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:21:55,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:59,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:59,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:00,937 INFO [train.py:1039] (0/4) Epoch 10, batch 150, loss[loss=0.2497, simple_loss=0.3004, pruned_loss=0.09951, over 22768.00 frames. ], tot_loss[loss=0.2132, simple_loss=0.2815, pruned_loss=0.07246, over 2507234.59 frames. ], batch size: 322, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:22:01,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:02,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:04,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:04,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=319720.0, ans=0.0 2023-09-29 09:22:07,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:22:08,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:22:13,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 09:22:13,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 09:22:13,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 09:22:13,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=319720.0, ans=0.2 2023-09-29 09:22:16,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:22:16,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:22:17,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:22:18,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=319786.6666666667, ans=0.125 2023-09-29 09:22:19,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:22:19,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:19,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:19,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:22:21,636 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 09:22:23,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:27,356 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.318e-02 2023-09-29 09:22:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:32,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:22:35,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 09:22:38,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:22:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:38,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:22:41,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:22:44,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:44,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 09:22:46,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:47,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 09:22:53,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:55,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:22:55,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:22:55,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:22:59,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:00,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 09:23:02,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:23:04,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:23:05,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:23:08,105 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.959e+02 2.278e+02 2.639e+02 3.877e+02, threshold=4.556e+02, percent-clipped=0.0 2023-09-29 09:23:08,610 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-48000.pt 2023-09-29 09:23:12,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:23:12,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 09:23:12,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:23:12,851 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 09:23:17,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:22,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:23:23,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:23:25,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 09:23:26,912 INFO [train.py:1039] (0/4) Epoch 10, batch 200, loss[loss=0.2135, simple_loss=0.2855, pruned_loss=0.07076, over 24683.00 frames. ], tot_loss[loss=0.2143, simple_loss=0.2822, pruned_loss=0.07318, over 2997451.51 frames. 
], batch size: 73, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:23:27,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:27,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:31,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 09:23:31,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=320053.3333333333, ans=0.125 2023-09-29 09:23:32,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:23:34,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:35,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:39,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:23:41,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:41,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 09:23:45,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=320120.0, ans=0.125 2023-09-29 09:23:46,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=320120.0, ans=0.125 2023-09-29 09:23:53,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=320120.0, ans=0.04949747468305833 2023-09-29 09:24:00,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:24:02,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:24:03,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:24:05,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:24:05,359 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:24:05,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=320186.6666666667, ans=0.1 2023-09-29 09:24:06,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:24:06,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 09:24:08,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:08,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:24:09,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:09,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:11,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=320186.6666666667, ans=0.0 2023-09-29 09:24:12,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 09:24:12,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:24:12,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:13,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=320253.3333333333, ans=0.0 2023-09-29 09:24:17,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:24:25,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:24:25,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=320253.3333333333, ans=0.05 2023-09-29 09:24:30,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:32,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:24:38,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:40,127 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=320320.0, ans=0.125 2023-09-29 09:24:41,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 09:24:42,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:42,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:24:42,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:43,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:24:44,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. 
Duration: 0.99 2023-09-29 09:24:44,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:24:44,682 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 09:24:44,962 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=320320.0, ans=0.125 2023-09-29 09:24:46,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:47,880 INFO [train.py:1039] (0/4) Epoch 10, batch 250, loss[loss=0.2077, simple_loss=0.2937, pruned_loss=0.06091, over 24452.00 frames. ], tot_loss[loss=0.2134, simple_loss=0.2819, pruned_loss=0.07242, over 3379276.61 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:24:50,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:24:51,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:51,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:53,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:24:53,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:57,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:24:59,821 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.45 vs. limit=22.5 2023-09-29 09:25:00,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:06,479 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.45 vs. limit=10.0 2023-09-29 09:25:10,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:10,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=320453.3333333333, ans=0.0 2023-09-29 09:25:13,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:25:14,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:25:21,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:25:21,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:25:21,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=320520.0, ans=0.0 2023-09-29 09:25:22,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 09:25:22,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:24,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:25:24,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:25:26,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:29,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:25:33,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 09:25:33,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:35,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:25:35,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:25:36,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:25:36,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 09:25:36,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=320586.6666666667, ans=10.0 2023-09-29 09:25:36,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=320586.6666666667, ans=0.125 2023-09-29 09:25:38,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:25:38,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:25:40,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:41,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:25:41,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:25:46,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:25:50,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:53,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 09:25:55,084 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.019e+02 2.235e+02 2.582e+02 3.547e+02, threshold=4.469e+02, percent-clipped=0.0 2023-09-29 09:25:59,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:02,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:26:05,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=320653.3333333333, ans=0.125 2023-09-29 09:26:07,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 09:26:07,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:08,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:26:08,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 09:26:10,177 INFO [train.py:1039] (0/4) Epoch 10, batch 300, loss[loss=0.2053, simple_loss=0.2897, pruned_loss=0.06045, over 24344.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2792, pruned_loss=0.07096, over 3677599.52 frames. ], batch size: 74, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:26:10,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:26:11,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 09:26:11,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 09:26:16,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:26:17,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=320720.0, ans=0.025 2023-09-29 09:26:18,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:26:21,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:26:21,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 09:26:23,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:24,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:26:24,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 09:26:24,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:27,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:26:32,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 09:26:34,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 09:26:37,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=320786.6666666667, ans=0.0 2023-09-29 09:26:38,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 09:26:38,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:41,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:43,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:43,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 09:26:43,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:26:45,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:26:46,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:26:48,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:52,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 09:26:52,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 09:26:53,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:26:56,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:56,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=320853.3333333333, ans=0.125 2023-09-29 09:26:57,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 09:26:57,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:01,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:27:01,788 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.72 vs. limit=15.0 2023-09-29 09:27:04,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:27:04,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 09:27:08,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:08,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 09:27:11,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:13,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=320920.0, ans=0.125 2023-09-29 09:27:14,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:27:14,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 09:27:14,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:27:15,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:17,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 09:27:20,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:21,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:22,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:22,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:22,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 09:27:27,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:27,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:27:28,018 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.97 vs. limit=15.0 2023-09-29 09:27:30,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:31,838 INFO [train.py:1039] (0/4) Epoch 10, batch 350, loss[loss=0.213, simple_loss=0.2737, pruned_loss=0.07616, over 23666.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2759, pruned_loss=0.06976, over 3888763.40 frames. ], batch size: 232, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:27:35,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:40,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:41,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:44,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=321053.3333333333, ans=15.0 2023-09-29 09:27:45,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 09:27:47,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 09:27:47,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 09:27:49,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:50,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 09:27:50,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:55,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 09:27:57,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:27:59,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:28:00,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:28:02,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:02,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:28:02,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:28:05,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:07,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=321186.6666666667, ans=0.0 2023-09-29 09:28:12,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:28:12,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:28:15,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:28:15,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:20,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=321253.3333333333, ans=0.2 2023-09-29 09:28:21,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 09:28:21,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:28:27,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:27,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:28:29,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 09:28:31,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=321253.3333333333, ans=0.125 2023-09-29 09:28:32,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:32,365 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 09:28:35,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 09:28:35,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:39,688 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 2.071e+02 2.540e+02 3.083e+02 5.946e+02, threshold=5.081e+02, percent-clipped=4.0 2023-09-29 09:28:40,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:40,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 09:28:43,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:28:44,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:28:46,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:47,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:47,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:50,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:53,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:55,215 INFO [train.py:1039] (0/4) Epoch 10, batch 400, loss[loss=0.2038, simple_loss=0.2805, pruned_loss=0.06356, over 23997.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2759, pruned_loss=0.06849, over 4090765.47 frames. ], batch size: 80, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:28:55,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=321386.6666666667, ans=0.125 2023-09-29 09:28:56,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:28:58,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 09:28:58,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:28:58,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:02,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:29:02,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:05,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:07,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:08,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 09:29:10,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 09:29:10,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:11,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 09:29:11,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:16,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:29:16,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:29:16,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 09:29:18,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:29:18,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:18,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:19,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:29:22,552 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 09:29:22,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 09:29:25,744 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.92 vs. limit=12.0 2023-09-29 09:29:26,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=321520.0, ans=0.1 2023-09-29 09:29:29,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:30,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:31,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. 
Duration: 0.81 2023-09-29 09:29:33,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 09:29:34,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:29:35,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=321520.0, ans=0.125 2023-09-29 09:29:35,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=321520.0, ans=0.0 2023-09-29 09:29:36,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=321520.0, ans=0.0 2023-09-29 09:29:37,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:29:45,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 09:29:45,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=321586.6666666667, ans=0.09899494936611666 2023-09-29 09:29:48,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:29:49,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 09:29:52,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:54,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 09:29:55,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 09:29:58,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:30:02,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:30:04,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:30:06,250 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.82 vs. limit=12.0 2023-09-29 09:30:07,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:08,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 09:30:10,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:30:11,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 09:30:15,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:30:15,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 09:30:15,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=321720.0, ans=0.2 2023-09-29 09:30:16,745 INFO [train.py:1039] (0/4) Epoch 10, batch 450, loss[loss=0.1784, simple_loss=0.2515, pruned_loss=0.0526, over 24406.00 frames. ], tot_loss[loss=0.206, simple_loss=0.276, pruned_loss=0.06801, over 4236765.87 frames. ], batch size: 58, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:30:16,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 09:30:19,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:30:19,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:30:20,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:30:22,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 09:30:22,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:30:23,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:30:25,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:30:25,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. 
Duration: 0.99 2023-09-29 09:30:26,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:30:26,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:30:29,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:30:37,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:37,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=321786.6666666667, ans=0.1 2023-09-29 09:30:38,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:30:41,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 09:30:41,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 09:30:45,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:30:48,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:51,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:30:56,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 09:30:56,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:59,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 09:30:59,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 09:31:00,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 09:31:02,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:02,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:03,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:31:05,362 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 09:31:05,376 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 09:31:05,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:31:06,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:31:09,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 09:31:12,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:31:12,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=321920.0, ans=0.125 2023-09-29 09:31:14,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:31:14,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:31:15,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 09:31:17,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:20,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:31:20,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=321986.6666666667, ans=0.125 2023-09-29 09:31:21,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:31:21,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. 
Duration: 0.83 2023-09-29 09:31:22,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=321986.6666666667, ans=0.1 2023-09-29 09:31:25,377 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.147e+02 2.621e+02 3.477e+02, threshold=4.294e+02, percent-clipped=0.0 2023-09-29 09:31:26,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:31:27,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 09:31:29,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 09:31:30,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:35,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:31:36,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:31:38,104 INFO [train.py:1039] (0/4) Epoch 10, batch 500, loss[loss=0.2047, simple_loss=0.2665, pruned_loss=0.07146, over 23598.00 frames. ], tot_loss[loss=0.2061, simple_loss=0.2764, pruned_loss=0.06791, over 4346548.90 frames. ], batch size: 149, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:31:39,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:31:39,675 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-09-29 09:31:43,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:45,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:31:46,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:46,870 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 09:31:48,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 09:31:48,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:51,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:31:53,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=322120.0, ans=0.125 2023-09-29 09:31:54,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:31:57,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:31:59,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=322120.0, ans=0.125 2023-09-29 09:32:00,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:32:00,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:32:02,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:11,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:11,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:32:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:32:12,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:12,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 09:32:12,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:32:14,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=322186.6666666667, ans=0.125 2023-09-29 09:32:17,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:32:19,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:32:19,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 09:32:19,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:21,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 09:32:24,628 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 09:32:26,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:27,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:30,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:32:32,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 09:32:34,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:32:37,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:41,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:32:41,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=322253.3333333333, ans=0.125 2023-09-29 09:32:44,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:44,448 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:32:46,456 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.41 vs. limit=15.0 2023-09-29 09:32:48,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:52,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 09:32:52,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:52,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:56,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 09:32:56,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:32:59,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:00,700 INFO [train.py:1039] (0/4) Epoch 10, batch 550, loss[loss=0.18, simple_loss=0.2551, pruned_loss=0.05248, over 24287.00 frames. 
], tot_loss[loss=0.208, simple_loss=0.2776, pruned_loss=0.06925, over 4433287.56 frames. ], batch size: 56, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:33:04,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 09:33:04,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=322386.6666666667, ans=0.125 2023-09-29 09:33:07,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 09:33:07,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:07,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 09:33:09,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:33:09,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:10,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:11,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:12,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:33:14,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 09:33:15,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:16,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=322453.3333333333, ans=0.125 2023-09-29 09:33:17,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 09:33:17,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:33:20,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:20,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:20,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=322453.3333333333, ans=0.0 2023-09-29 09:33:23,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:25,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:27,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=322453.3333333333, ans=0.0 2023-09-29 09:33:29,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 09:33:31,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. 
Duration: 0.97 2023-09-29 09:33:32,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:33:33,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=322520.0, ans=0.125 2023-09-29 09:33:33,133 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=322520.0, ans=0.1 2023-09-29 09:33:36,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.73 vs. limit=22.5 2023-09-29 09:33:37,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:33:37,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:39,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:33:39,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=322520.0, ans=0.0 2023-09-29 09:33:44,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:44,622 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 09:33:46,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:47,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-29 09:33:49,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:49,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:33:49,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:33:52,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:52,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 09:33:54,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 09:33:55,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:33:55,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:56,216 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.92 vs. limit=12.0 2023-09-29 09:33:57,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:33:57,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:34:00,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 09:34:00,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:34:02,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:34:02,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=322586.6666666667, ans=0.125 2023-09-29 09:34:04,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:07,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:34:09,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:34:10,437 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.009e+02 2.272e+02 2.657e+02 5.113e+02, threshold=4.543e+02, percent-clipped=1.0 2023-09-29 09:34:10,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:12,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:34:12,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:14,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 09:34:14,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:34:21,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 09:34:22,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 09:34:24,160 INFO [train.py:1039] (0/4) Epoch 10, batch 600, loss[loss=0.2333, simple_loss=0.3085, pruned_loss=0.07902, over 23995.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2784, pruned_loss=0.07021, over 4492398.02 frames. ], batch size: 80, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:34:24,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:34:24,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:34:24,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:29,557 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.63 vs. limit=22.5 2023-09-29 09:34:31,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:34:35,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:34:37,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-09-29 09:34:39,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:34:40,178 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.92 vs. limit=15.0 2023-09-29 09:34:40,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:34:42,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:46,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 09:34:46,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:34:52,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 09:34:53,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=322786.6666666667, ans=0.0 2023-09-29 09:34:55,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:34:55,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:55,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 09:34:57,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=322853.3333333333, ans=0.0 2023-09-29 09:35:02,658 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.15 vs. limit=15.0 2023-09-29 09:35:04,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:35:04,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:35:04,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:12,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:35:16,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:16,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:35:16,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:35:17,492 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.24 vs. limit=15.0 2023-09-29 09:35:25,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-09-29 09:35:28,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:35:29,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:35:33,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 09:35:33,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:35:35,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=322986.6666666667, ans=0.09899494936611666 2023-09-29 09:35:35,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=322986.6666666667, ans=0.1 2023-09-29 09:35:36,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 09:35:38,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:35:38,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:35:44,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:35:46,800 INFO [train.py:1039] (0/4) Epoch 10, batch 650, loss[loss=0.1847, simple_loss=0.2674, pruned_loss=0.05101, over 24470.00 frames. ], tot_loss[loss=0.209, simple_loss=0.2772, pruned_loss=0.07043, over 4529758.48 frames. 
], batch size: 63, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:35:46,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:35:49,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:35:51,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:35:55,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:35:56,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 09:35:56,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:58,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=323053.3333333333, ans=0.125 2023-09-29 09:36:03,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:36:03,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:05,869 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.04 vs. limit=15.0 2023-09-29 09:36:06,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:36:09,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=323120.0, ans=0.125 2023-09-29 09:36:11,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 09:36:14,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:14,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:19,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:19,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:36:23,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:23,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:23,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:36:24,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:24,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:36:28,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 09:36:28,097 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 09:36:28,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:28,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:31,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=323186.6666666667, ans=0.07 2023-09-29 09:36:33,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:34,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:34,899 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.90 vs. limit=15.0 2023-09-29 09:36:35,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:36,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:36:36,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 09:36:38,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:36:38,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 09:36:38,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:36:38,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:40,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:36:41,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 09:36:44,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 09:36:44,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:44,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:44,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:36:44,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:46,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:48,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=323253.3333333333, ans=0.0 2023-09-29 09:36:53,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:36:55,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:56,669 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.920e+02 2.159e+02 2.398e+02 3.616e+02, threshold=4.317e+02, percent-clipped=0.0 2023-09-29 09:36:56,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:59,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:59,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:37:01,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:37:06,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.17 vs. limit=22.5 2023-09-29 09:37:09,365 INFO [train.py:1039] (0/4) Epoch 10, batch 700, loss[loss=0.1974, simple_loss=0.2815, pruned_loss=0.05668, over 24434.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2762, pruned_loss=0.06974, over 4567787.52 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:37:09,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:37:09,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:09,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 09:37:10,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:12,110 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:37:14,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 09:37:16,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 09:37:18,449 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.90 vs. limit=15.0 2023-09-29 09:37:19,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 09:37:19,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:20,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:37:22,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 09:37:29,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:31,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:37:32,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:37:32,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:37:34,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:37:34,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=323453.3333333333, ans=0.1 2023-09-29 09:37:36,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:39,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:37:39,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:37:42,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 09:37:45,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 09:37:49,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:37:50,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:37:52,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:37:57,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:37:57,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 09:38:00,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=323586.6666666667, ans=0.0 2023-09-29 09:38:04,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:04,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:38:05,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 09:38:10,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:38:10,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:15,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:38:17,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=323653.3333333333, ans=0.125 2023-09-29 09:38:18,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:38:18,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 09:38:19,721 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.87 vs. 
limit=12.0 2023-09-29 09:38:22,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 09:38:22,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 09:38:25,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:25,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=323653.3333333333, ans=0.0 2023-09-29 09:38:28,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:30,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:38:32,729 INFO [train.py:1039] (0/4) Epoch 10, batch 750, loss[loss=0.2089, simple_loss=0.2851, pruned_loss=0.06629, over 24677.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2757, pruned_loss=0.06936, over 4592862.72 frames. ], batch size: 65, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:38:32,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:32,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 09:38:37,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 09:38:37,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. 
Duration: 0.74 2023-09-29 09:38:38,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 09:38:38,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 09:38:39,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 09:38:40,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:38:41,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 09:38:42,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:42,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=323720.0, ans=0.125 2023-09-29 09:38:43,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:38:43,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:45,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:46,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:38:46,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 09:38:47,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=323786.6666666667, ans=0.1 2023-09-29 09:38:50,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:38:50,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:38:50,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=323786.6666666667, ans=0.125 2023-09-29 09:38:53,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:38:55,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:57,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:57,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 09:38:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:38:58,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:00,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:03,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 09:39:05,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 09:39:05,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:07,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 09:39:07,167 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 09:39:07,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 09:39:07,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:39:07,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:39:10,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:39:17,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:39:17,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:17,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:39:21,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:39:21,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:39:22,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 09:39:22,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:39:24,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:39:25,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:39:29,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:39:29,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 09:39:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:36,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:39:38,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:39:38,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 09:39:41,892 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.983e+02 2.343e+02 2.893e+02 4.717e+02, threshold=4.686e+02, percent-clipped=1.0 2023-09-29 09:39:41,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:39:43,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 09:39:43,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:39:43,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:51,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:53,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:39:54,411 INFO [train.py:1039] (0/4) Epoch 10, batch 800, loss[loss=0.2301, simple_loss=0.3053, pruned_loss=0.07744, over 23703.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2761, pruned_loss=0.06923, over 4629762.30 frames. 
], batch size: 85, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:39:54,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=324053.3333333333, ans=0.0 2023-09-29 09:39:59,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:59,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:02,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:40:02,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:04,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:04,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:06,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:12,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:12,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:40:16,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. 
Duration: 0.76 2023-09-29 09:40:16,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:16,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=324120.0, ans=0.125 2023-09-29 09:40:19,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:19,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:40:19,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:19,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 09:40:21,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:21,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 09:40:24,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:25,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:27,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:40:27,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:40:30,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:30,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:35,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:40:35,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:40:35,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:40:37,053 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 09:40:38,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 09:40:38,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:40:38,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:39,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:40,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:40:46,217 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. 
Duration: 0.896 2023-09-29 09:40:46,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 09:40:49,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:40:49,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:40:51,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:40:51,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=324253.3333333333, ans=0.0 2023-09-29 09:40:54,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:54,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=324253.3333333333, ans=0.0 2023-09-29 09:40:55,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 09:40:57,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:41:01,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 09:41:09,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 09:41:10,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=324320.0, ans=0.5 2023-09-29 09:41:11,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:41:12,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 09:41:13,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:41:14,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:16,491 INFO [train.py:1039] (0/4) Epoch 10, batch 850, loss[loss=0.1997, simple_loss=0.2827, pruned_loss=0.0584, over 24693.00 frames. ], tot_loss[loss=0.2085, simple_loss=0.2776, pruned_loss=0.06964, over 4660124.56 frames. ], batch size: 73, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:41:16,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 09:41:17,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:20,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:41:21,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:23,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 09:41:25,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:41:25,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=324386.6666666667, ans=0.2 2023-09-29 09:41:26,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 09:41:28,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 09:41:28,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 09:41:28,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:28,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:41:30,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:31,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:31,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:41:31,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=324453.3333333333, ans=0.0 2023-09-29 09:41:37,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:41:37,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:41:37,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 09:41:42,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 09:41:45,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:47,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 09:41:53,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 09:41:55,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 09:41:57,545 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 09:41:57,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:41:57,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:41:57,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:42:00,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:42:03,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 09:42:05,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:42:05,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:06,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:42:06,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:42:08,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:42:09,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:42:11,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 09:42:14,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:42:14,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:16,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 09:42:16,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:17,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:19,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:21,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=324653.3333333333, ans=0.125 2023-09-29 09:42:22,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:42:22,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:42:24,597 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.918e+02 2.187e+02 2.530e+02 4.309e+02, threshold=4.375e+02, percent-clipped=0.0 2023-09-29 09:42:24,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:26,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:42:29,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=324653.3333333333, ans=0.2 2023-09-29 09:42:33,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 09:42:35,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:36,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 09:42:36,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:36,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:37,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=324720.0, ans=0.0 2023-09-29 09:42:38,085 INFO [train.py:1039] (0/4) Epoch 10, batch 900, loss[loss=0.2484, simple_loss=0.3026, pruned_loss=0.09707, over 23848.00 frames. ], tot_loss[loss=0.2092, simple_loss=0.2781, pruned_loss=0.0701, over 4682583.68 frames. ], batch size: 164, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:42:41,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 09:42:44,499 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=324720.0, ans=0.05 2023-09-29 09:42:45,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:42:47,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:47,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. 
Duration: 0.57 2023-09-29 09:42:49,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:42:51,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 09:42:52,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:42:54,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:54,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:42:54,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:42:54,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:43:06,131 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.66 vs. limit=6.0 2023-09-29 09:43:08,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:09,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:43:09,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 09:43:14,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:18,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 09:43:20,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:43:24,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:43:25,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:43:25,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=324920.0, ans=0.125 2023-09-29 09:43:26,982 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 09:43:27,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 09:43:35,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:43:35,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:43:36,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 09:43:40,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=324920.0, ans=0.2 2023-09-29 09:43:44,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:44,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:43:45,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 09:43:47,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:48,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 09:43:48,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:43:50,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:50,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:43:51,036 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.15 vs. limit=12.0 2023-09-29 09:43:51,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:43:55,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-09-29 09:43:56,508 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 09:43:58,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:43:59,822 INFO [train.py:1039] (0/4) Epoch 10, batch 950, loss[loss=0.1999, simple_loss=0.2763, pruned_loss=0.06169, over 24652.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.279, pruned_loss=0.07061, over 4688796.11 frames. ], batch size: 65, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:43:59,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 09:44:02,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:06,283 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.66 vs. limit=15.0 2023-09-29 09:44:08,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 09:44:12,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:15,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 09:44:18,310 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 09:44:21,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:21,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:22,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:24,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:44:24,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 09:44:24,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:44:27,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:28,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 09:44:29,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:32,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 09:44:32,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:33,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 09:44:37,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:44:37,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:40,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:44:45,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:44:45,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:48,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=325253.3333333333, ans=0.035 2023-09-29 09:44:49,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 09:44:50,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:44:50,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:44:52,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:44:53,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:53,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:44:57,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 09:44:58,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:45:00,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:01,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:01,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 09:45:03,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:03,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:45:03,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 09:45:08,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 09:45:09,606 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.117e+02 2.443e+02 3.103e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-29 09:45:09,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:15,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:17,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 09:45:17,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 09:45:22,956 INFO [train.py:1039] (0/4) Epoch 10, batch 1000, loss[loss=0.2222, simple_loss=0.2603, pruned_loss=0.09208, over 19547.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2768, pruned_loss=0.06986, over 4691080.61 frames. ], batch size: 388, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:45:23,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:27,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 09:45:27,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:45:35,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:45:35,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. 
Duration: 0.86 2023-09-29 09:45:35,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 09:45:39,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:39,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:43,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:46,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 09:45:47,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=325453.3333333333, ans=0.125 2023-09-29 09:45:50,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 09:45:50,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 09:45:50,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=325453.3333333333, ans=0.07 2023-09-29 09:45:51,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:45:53,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 09:45:54,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. 
Duration: 0.5 2023-09-29 09:45:54,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 09:45:57,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:57,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:04,393 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.09 vs. limit=15.0 2023-09-29 09:46:06,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:06,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:46:08,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:08,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:08,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 09:46:09,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:11,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:46:11,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:46:12,838 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 09:46:14,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 09:46:16,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 09:46:18,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 09:46:21,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:46:24,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=325586.6666666667, ans=0.0 2023-09-29 09:46:28,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:28,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:46:30,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:30,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:46:33,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 09:46:34,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 09:46:35,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 09:46:35,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 09:46:36,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:46:36,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:38,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:46:40,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=325653.3333333333, ans=0.025 2023-09-29 09:46:40,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=325653.3333333333, ans=0.0 2023-09-29 09:46:41,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:46:42,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:43,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=325720.0, ans=0.025 2023-09-29 09:46:44,477 INFO [train.py:1039] (0/4) Epoch 10, batch 1050, loss[loss=0.2047, simple_loss=0.2405, pruned_loss=0.08445, over 19183.00 frames. ], tot_loss[loss=0.207, simple_loss=0.275, pruned_loss=0.06954, over 4682242.28 frames. 
], batch size: 389, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:46:46,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:46:47,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:46:49,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:46:51,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:52,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:46:54,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:46:55,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:46:59,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:46:59,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:46:59,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:47:01,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 09:47:03,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 09:47:03,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:04,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 09:47:06,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:47:06,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 09:47:06,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:47:15,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:47:15,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:47:15,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:19,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 09:47:20,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 09:47:20,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 09:47:22,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 09:47:25,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 09:47:26,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:47:31,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:47:34,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:47:34,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:47:35,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:47:37,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=325920.0, ans=0.125 2023-09-29 09:47:39,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.65 vs. limit=15.0 2023-09-29 09:47:40,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:47:43,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. 
Duration: 0.86 2023-09-29 09:47:43,732 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=325920.0, ans=0.05 2023-09-29 09:47:44,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 09:47:45,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 09:47:45,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:46,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:47:48,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 09:47:51,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:47:52,395 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.027e+02 2.343e+02 2.792e+02 3.800e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-29 09:47:54,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:54,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:47:56,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:47:56,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:48:01,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 09:48:04,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:48:04,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 09:48:04,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 09:48:05,647 INFO [train.py:1039] (0/4) Epoch 10, batch 1100, loss[loss=0.2112, simple_loss=0.2954, pruned_loss=0.06352, over 24419.00 frames. ], tot_loss[loss=0.207, simple_loss=0.2753, pruned_loss=0.06937, over 4685634.74 frames. ], batch size: 69, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:48:05,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:48:11,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:14,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:48:17,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:48:18,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 09:48:19,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:19,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 09:48:20,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:48:22,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:48:26,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:48:29,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:48:29,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 09:48:30,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:48:32,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:32,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:48:35,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 09:48:37,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=326186.6666666667, ans=0.2 2023-09-29 09:48:38,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:48:44,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:48:47,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 09:48:47,645 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 09:48:47,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:49,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=326186.6666666667, ans=0.125 2023-09-29 09:48:52,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:53,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:48:53,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:55,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 09:48:56,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 09:48:56,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:48:56,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:48:56,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:58,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 09:49:02,115 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=326253.3333333333, ans=0.2 2023-09-29 09:49:04,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:49:04,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 09:49:08,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:49:11,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:49:13,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 09:49:13,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:49:14,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 09:49:15,306 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=326320.0, ans=0.125 2023-09-29 09:49:19,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:19,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:20,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 09:49:21,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:49:21,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:23,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 09:49:23,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:49:23,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 09:49:25,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:49:25,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:49:26,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 09:49:28,274 INFO [train.py:1039] (0/4) Epoch 10, batch 1150, loss[loss=0.1743, simple_loss=0.2563, pruned_loss=0.04615, over 24291.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2755, pruned_loss=0.06913, over 4700492.09 frames. ], batch size: 61, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:49:31,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:34,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:49:36,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:36,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:49:38,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 09:49:38,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:49:41,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 09:49:43,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:43,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:49:49,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-09-29 09:49:51,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:55,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:56,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:49:57,441 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.09 vs. limit=15.0 2023-09-29 09:49:58,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 09:49:58,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:49:58,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:50:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 09:50:05,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:07,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:50:17,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:50:22,646 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:50:23,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:23,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 09:50:25,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:25,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:31,185 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 09:50:31,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:36,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=326653.3333333333, ans=0.07 2023-09-29 09:50:37,350 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.028e+02 2.366e+02 3.044e+02 5.235e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 09:50:39,074 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 09:50:44,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:45,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 09:50:45,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:50:45,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:50:49,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:50:50,928 INFO [train.py:1039] (0/4) Epoch 10, batch 1200, loss[loss=0.1952, simple_loss=0.268, pruned_loss=0.06115, over 23265.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2756, pruned_loss=0.06912, over 4704759.14 frames. ], batch size: 93, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:50:53,243 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.14 vs. limit=22.5 2023-09-29 09:50:53,246 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.96 vs. limit=10.0 2023-09-29 09:50:54,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:50:54,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:50:57,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:57,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:57,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 09:51:00,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:51:02,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:51:04,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:04,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:04,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=326720.0, ans=0.0 2023-09-29 09:51:06,742 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=12.0 2023-09-29 09:51:07,437 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 09:51:10,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 09:51:15,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:51:17,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:51:20,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:20,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 09:51:20,401 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 09:51:23,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:31,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:51:31,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:51:31,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 09:51:33,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:51:34,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 09:51:39,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=326920.0, ans=0.0 2023-09-29 09:51:40,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 09:51:40,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:42,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 09:51:44,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:51:45,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:45,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:51:46,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:51:47,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 09:51:47,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:51:49,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:51:49,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:51:50,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:50,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:55,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:51:58,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 09:52:02,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 09:52:05,219 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 09:52:08,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:11,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:52:13,021 INFO [train.py:1039] (0/4) Epoch 10, batch 1250, loss[loss=0.1962, simple_loss=0.2799, pruned_loss=0.05626, over 24434.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2769, pruned_loss=0.06963, over 4723290.45 frames. ], batch size: 69, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:52:13,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:52:15,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:52:16,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 09:52:21,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:52:22,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:22,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. 
Duration: 0.47 2023-09-29 09:52:26,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:52:26,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:52:31,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:52:33,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:33,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:52:33,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:36,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:52:40,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:52:40,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:52:41,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:42,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:43,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:52:45,868 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=327186.6666666667, ans=0.0 2023-09-29 09:52:47,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:52:49,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:52:52,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 09:52:54,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:52:57,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:52:57,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 09:52:57,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:59,592 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 09:52:59,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:59,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:04,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 09:53:06,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:07,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:53:09,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 09:53:09,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 09:53:09,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 09:53:09,874 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.57 vs. limit=15.0 2023-09-29 09:53:10,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:12,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 09:53:12,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:16,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:53:16,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:53:17,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. 
Duration: 0.98 2023-09-29 09:53:18,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:53:19,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:53:19,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:53:20,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:53:21,757 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.035e+02 2.245e+02 2.590e+02 3.760e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 09:53:21,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 09:53:26,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:28,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:53:30,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:53:35,031 INFO [train.py:1039] (0/4) Epoch 10, batch 1300, loss[loss=0.1958, simple_loss=0.2792, pruned_loss=0.05622, over 24299.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2761, pruned_loss=0.06877, over 4723535.89 frames. ], batch size: 74, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:53:35,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 09:53:37,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:38,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 09:53:39,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=327386.6666666667, ans=0.0 2023-09-29 09:53:43,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:45,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:53:47,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:53:49,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:49,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:53:49,979 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:53:51,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 09:53:55,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:53:56,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 09:53:57,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 09:54:01,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:54:04,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:04,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:06,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:54:08,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:08,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:54:10,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:54:10,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=327520.0, ans=0.1 2023-09-29 09:54:11,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 09:54:17,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:54:18,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 09:54:19,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 09:54:21,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:54:24,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:54:26,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:54:26,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 09:54:27,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:27,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 09:54:29,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:33,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:33,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:54:36,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 09:54:38,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-09-29 09:54:40,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 09:54:47,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:54:49,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 09:54:49,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=327653.3333333333, ans=0.0 2023-09-29 09:54:51,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:56,842 INFO [train.py:1039] (0/4) Epoch 10, batch 1350, loss[loss=0.1958, simple_loss=0.2695, pruned_loss=0.06111, over 24551.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.2754, pruned_loss=0.06778, over 4736634.56 frames. ], batch size: 60, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:54:58,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 09:55:03,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:05,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:07,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:55:07,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 09:55:10,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:55:10,440 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=327720.0, ans=0.0 2023-09-29 09:55:11,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:15,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:17,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=327786.6666666667, ans=0.0 2023-09-29 09:55:19,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 09:55:19,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:55:20,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:55:22,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 09:55:24,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:55:25,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:55:25,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. 
Duration: 0.98 2023-09-29 09:55:27,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 09:55:28,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 09:55:31,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:32,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 09:55:40,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=327853.3333333333, ans=0.1 2023-09-29 09:55:43,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:53,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:55,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:55:55,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 09:55:58,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:56:00,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 09:56:00,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 09:56:00,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:56:03,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:56:06,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 09:56:06,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:56:09,400 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.130e+02 2.395e+02 2.953e+02 6.223e+02, threshold=4.790e+02, percent-clipped=2.0 2023-09-29 09:56:12,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 09:56:14,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 09:56:20,586 INFO [train.py:1039] (0/4) Epoch 10, batch 1400, loss[loss=0.2025, simple_loss=0.2344, pruned_loss=0.08529, over 19194.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2742, pruned_loss=0.06762, over 4725335.43 frames. ], batch size: 389, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:56:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 09:56:23,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:56:25,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:56:26,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=328053.3333333333, ans=0.07 2023-09-29 09:56:27,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:56:34,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 09:56:35,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 09:56:45,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:56:47,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:56:48,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:56:50,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:56:55,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:56:56,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 09:56:57,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=328186.6666666667, ans=0.2 2023-09-29 09:57:02,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=328186.6666666667, ans=0.0 2023-09-29 09:57:07,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:07,902 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.66 vs. limit=15.0 2023-09-29 09:57:08,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:11,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 09:57:11,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:57:12,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:57:12,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=328253.3333333333, ans=15.0 2023-09-29 09:57:14,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:57:14,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 09:57:15,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:57:15,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:57:15,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:57:17,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 09:57:17,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:57:21,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:25,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:57:31,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 09:57:33,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:57:35,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:57:38,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:57:38,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 09:57:41,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:57:43,353 INFO [train.py:1039] (0/4) Epoch 10, batch 1450, loss[loss=0.1839, simple_loss=0.2542, pruned_loss=0.05678, over 24333.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2733, pruned_loss=0.06675, over 4729073.56 frames. ], batch size: 56, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:57:45,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:57:47,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:57:47,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:47,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:57:50,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=328386.6666666667, ans=0.1 2023-09-29 09:57:52,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:53,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:57:53,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:55,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-09-29 09:57:55,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=328386.6666666667, ans=0.1 2023-09-29 09:57:56,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:57:57,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 09:57:58,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:00,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:00,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 09:58:01,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:01,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:58:01,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:58:01,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:03,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:58:06,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:58:10,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:11,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:58:11,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:58:14,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:58:14,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:19,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:58:19,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:22,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 09:58:24,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 09:58:27,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.10 vs. limit=15.0 2023-09-29 09:58:29,246 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 09:58:30,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:32,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:58:33,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:35,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 09:58:41,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:41,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 09:58:43,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 09:58:45,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:47,077 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=328586.6666666667, ans=0.09899494936611666 2023-09-29 09:58:49,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 09:58:49,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:53,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 09:58:54,753 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.987e+02 2.423e+02 2.995e+02 4.591e+02, threshold=4.846e+02, percent-clipped=0.0 2023-09-29 09:58:54,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 09:58:54,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 09:58:57,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:57,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:58:58,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=328653.3333333333, ans=0.0 2023-09-29 09:59:06,761 INFO [train.py:1039] (0/4) Epoch 10, batch 1500, loss[loss=0.2261, simple_loss=0.2913, pruned_loss=0.08049, over 23431.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2741, pruned_loss=0.06702, over 4730557.69 frames. ], batch size: 105, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:59:10,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 09:59:11,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 09:59:11,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:59:11,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:12,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:14,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:59:14,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 09:59:16,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:59:16,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:59:16,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:17,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:59:19,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:59:21,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 09:59:24,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=328786.6666666667, ans=0.2 2023-09-29 09:59:26,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 09:59:27,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:59:29,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:59:30,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:33,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 09:59:33,334 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=328786.6666666667, ans=0.125 2023-09-29 09:59:39,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 09:59:39,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:41,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-09-29 09:59:41,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=328853.3333333333, ans=0.125 2023-09-29 09:59:42,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:59:44,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:59:47,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:47,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:59:48,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 09:59:50,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:59:50,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:50,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 09:59:50,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:57,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:59:57,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. 
Duration: 0.94 2023-09-29 10:00:03,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:00:05,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:00:10,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=328986.6666666667, ans=0.125 2023-09-29 10:00:12,146 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 10:00:14,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:14,081 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 10:00:14,469 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=328986.6666666667, ans=0.2 2023-09-29 10:00:15,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:17,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:17,262 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 10:00:18,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 10:00:19,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=328986.6666666667, ans=0.0 2023-09-29 10:00:21,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 10:00:23,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:23,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=328986.6666666667, ans=0.125 2023-09-29 10:00:27,772 INFO [train.py:1039] (0/4) Epoch 10, batch 1550, loss[loss=0.2201, simple_loss=0.2776, pruned_loss=0.08136, over 23822.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2755, pruned_loss=0.06767, over 4732625.57 frames. ], batch size: 164, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:00:27,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:27,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:27,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:28,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:29,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:00:31,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. 
Duration: 0.56 2023-09-29 10:00:31,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 10:00:31,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:00:32,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 10:00:33,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 10:00:35,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:36,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:36,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:00:36,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:00:38,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:40,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:40,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=329053.3333333333, ans=0.125 2023-09-29 10:00:44,146 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. 
Duration: 0.896 2023-09-29 10:00:44,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:44,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:00:44,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:00:47,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:00:47,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 10:00:47,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=329120.0, ans=0.1 2023-09-29 10:00:49,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:49,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 10:00:51,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 10:00:51,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 10:00:51,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:52,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 10:00:55,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:58,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 10:00:58,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 10:01:01,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=329186.6666666667, ans=10.0 2023-09-29 10:01:05,946 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.60 vs. limit=6.0 2023-09-29 10:01:08,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:08,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=329186.6666666667, ans=0.125 2023-09-29 10:01:12,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:01:12,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:01:13,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:01:15,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 10:01:20,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 10:01:22,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:24,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=329253.3333333333, ans=0.035 2023-09-29 10:01:25,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:01:27,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=329253.3333333333, ans=0.95 2023-09-29 10:01:28,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:01:28,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:28,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 10:01:29,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:31,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:01:32,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:34,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:01:34,336 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. 
Duration: 0.896 2023-09-29 10:01:37,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:01:39,016 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.945e+02 2.181e+02 2.494e+02 3.678e+02, threshold=4.362e+02, percent-clipped=0.0 2023-09-29 10:01:44,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 10:01:49,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:50,488 INFO [train.py:1039] (0/4) Epoch 10, batch 1600, loss[loss=0.2487, simple_loss=0.2999, pruned_loss=0.09879, over 23366.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2767, pruned_loss=0.06872, over 4723550.04 frames. ], batch size: 285, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:01:50,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:50,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=329386.6666666667, ans=0.125 2023-09-29 10:01:52,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 10:01:54,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:56,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:56,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 10:01:56,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:01:57,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:02:01,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:01,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=329386.6666666667, ans=0.125 2023-09-29 10:02:02,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 10:02:03,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 10:02:06,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 10:02:09,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:09,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 10:02:10,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:02:12,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:02:19,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 10:02:22,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 10:02:25,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:02:25,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 10:02:25,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:26,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 10:02:33,253 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.63 vs. limit=15.0 2023-09-29 10:02:34,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 10:02:42,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:42,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 10:02:43,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:44,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:44,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 10:02:44,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=329586.6666666667, ans=0.0 2023-09-29 10:02:45,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:02:50,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:02:53,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:02:53,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:53,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:55,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:02:55,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=329653.3333333333, ans=0.1 2023-09-29 10:02:57,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:02:57,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:02:58,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 10:03:04,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:06,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:03:07,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 10:03:07,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:03:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 10:03:11,638 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.16 vs. limit=15.0 2023-09-29 10:03:13,619 INFO [train.py:1039] (0/4) Epoch 10, batch 1650, loss[loss=0.1962, simple_loss=0.275, pruned_loss=0.05866, over 24493.00 frames. ], tot_loss[loss=0.2098, simple_loss=0.2791, pruned_loss=0.07029, over 4710682.44 frames. ], batch size: 63, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:03:13,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:15,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:03:15,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:03:15,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-09-29 10:03:15,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 10:03:15,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=329720.0, ans=0.2 2023-09-29 10:03:17,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 10:03:17,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 10:03:20,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:21,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:21,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:03:21,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:03:24,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:25,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. 
Duration: 0.996375 2023-09-29 10:03:27,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=329720.0, ans=0.0 2023-09-29 10:03:28,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=329786.6666666667, ans=0.125 2023-09-29 10:03:30,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:03:30,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:30,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:03:30,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:03:31,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 10:03:31,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 10:03:40,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:03:43,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:03:49,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. 
Duration: 0.93 2023-09-29 10:03:50,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=329853.3333333333, ans=0.0 2023-09-29 10:03:51,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:03:54,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 10:03:55,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:03:56,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=329853.3333333333, ans=0.0 2023-09-29 10:04:00,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:04:00,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:04:01,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:02,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:04:03,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:06,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:08,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:04:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:09,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:11,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:11,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:04:13,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=329920.0, ans=0.2 2023-09-29 10:04:16,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:17,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 10:04:20,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:20,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 10:04:20,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 10:04:21,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 10:04:22,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 10:04:22,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:04:22,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:23,874 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.033e+02 2.438e+02 2.787e+02 4.126e+02, threshold=4.877e+02, percent-clipped=0.0 2023-09-29 10:04:24,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:24,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 10:04:28,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:30,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:04:30,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:33,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 10:04:35,203 INFO [train.py:1039] (0/4) Epoch 10, batch 1700, loss[loss=0.2292, simple_loss=0.2814, pruned_loss=0.08851, over 23748.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2793, pruned_loss=0.0708, over 4705578.78 frames. ], batch size: 164, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:04:37,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 10:04:37,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:04:37,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 10:04:38,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:04:38,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:04:38,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:41,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:04:42,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:04:42,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=330053.3333333333, ans=0.0 2023-09-29 10:04:43,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 10:04:46,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:04:55,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:56,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:04:58,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=330120.0, ans=0.0 2023-09-29 10:05:00,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.50 vs. limit=15.0 2023-09-29 10:05:01,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:05:01,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:01,800 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:05:03,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:05:03,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:03,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=330120.0, ans=0.2 2023-09-29 10:05:08,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 10:05:09,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:05:09,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:11,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-09-29 10:05:12,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:05:16,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 10:05:16,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 10:05:18,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:19,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 10:05:22,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:05:28,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=330253.3333333333, ans=0.0 2023-09-29 10:05:31,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:32,460 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.91 vs. limit=15.0 2023-09-29 10:05:32,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:33,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:35,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 10:05:35,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 10:05:35,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:37,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:37,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 10:05:37,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:05:37,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:39,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:39,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:05:41,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:41,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:05:41,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:43,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 10:05:44,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:48,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:50,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 10:05:51,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:51,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=330320.0, ans=0.2 2023-09-29 10:05:52,927 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.87 vs. limit=8.0 2023-09-29 10:05:53,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 10:05:58,419 INFO [train.py:1039] (0/4) Epoch 10, batch 1750, loss[loss=0.1833, simple_loss=0.2538, pruned_loss=0.05643, over 24478.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2765, pruned_loss=0.06988, over 4692485.81 frames. ], batch size: 58, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:06:01,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:04,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:06:04,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:06:06,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 10:06:06,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:06:09,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:06:09,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:14,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 10:06:16,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:16,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=330453.3333333333, ans=0.125 2023-09-29 10:06:17,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 10:06:17,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:19,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:06:23,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-29 10:06:25,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=330453.3333333333, ans=0.0 2023-09-29 10:06:26,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 10:06:27,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=330453.3333333333, ans=0.0 2023-09-29 10:06:28,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:06:28,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 10:06:35,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:06:38,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:06:38,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:41,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:41,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:44,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:06:47,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:06:49,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:50,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:53,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 10:06:54,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:57,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 10:06:59,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:01,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:01,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:07:05,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:07:05,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 10:07:07,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:07:10,060 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.086e+02 2.387e+02 2.992e+02 5.082e+02, threshold=4.774e+02, percent-clipped=1.0 2023-09-29 10:07:10,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:13,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:16,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:17,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:07:20,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 10:07:20,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:21,638 INFO [train.py:1039] (0/4) Epoch 10, batch 1800, loss[loss=0.214, simple_loss=0.277, pruned_loss=0.0755, over 23790.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2757, pruned_loss=0.06967, over 4691436.53 frames. ], batch size: 212, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:07:21,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:07:21,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:21,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 10:07:21,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:07:21,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:07:24,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:07:26,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:28,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:07:31,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:35,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:07:36,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:07:37,545 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.84 vs. limit=15.0 2023-09-29 10:07:40,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:07:43,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 10:07:43,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:44,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:07:45,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=330786.6666666667, ans=0.1 2023-09-29 10:07:48,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:48,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 10:07:49,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:52,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:54,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=330853.3333333333, ans=0.1 2023-09-29 10:07:56,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 10:07:58,617 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.82 vs. limit=15.0 2023-09-29 10:07:59,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 10:08:01,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. 
Duration: 0.76 2023-09-29 10:08:01,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:01,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:08:01,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:02,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:08:10,146 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 10:08:11,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:08:13,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:14,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 10:08:16,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 10:08:16,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:08:18,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:08:19,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 10:08:23,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 10:08:31,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:08:31,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 10:08:32,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:08:32,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:32,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:08:34,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 10:08:35,156 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.90 vs. limit=15.0 2023-09-29 10:08:37,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:08:37,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:08:40,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 10:08:40,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:08:42,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:42,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:08:42,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:44,179 INFO [train.py:1039] (0/4) Epoch 10, batch 1850, loss[loss=0.2131, simple_loss=0.293, pruned_loss=0.06657, over 24336.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2753, pruned_loss=0.06927, over 4703399.66 frames. ], batch size: 74, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:08:44,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:45,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:08:47,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:48,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:52,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:08:52,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:08:58,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 10:09:00,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 10:09:01,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=331120.0, ans=0.125 2023-09-29 10:09:05,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 10:09:08,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 10:09:11,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:11,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 10:09:11,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 10:09:18,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=331186.6666666667, ans=0.0 2023-09-29 10:09:21,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:09:22,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 10:09:26,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:09:26,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 10:09:31,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 10:09:31,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:31,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:09:34,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:09:34,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:09:37,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:09:42,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:09:43,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:43,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:09:43,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:45,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:47,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 10:09:51,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 10:09:52,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:55,860 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.969e+02 2.199e+02 2.550e+02 3.875e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 10:09:57,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:09:59,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:09:59,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 10:09:59,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 10:10:01,250 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 10:10:02,703 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 10:10:04,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:10:04,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:10:04,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 10:10:04,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=331386.6666666667, ans=0.0 2023-09-29 10:10:05,654 INFO [train.py:1039] (0/4) Epoch 10, batch 1900, loss[loss=0.2089, simple_loss=0.2826, pruned_loss=0.06755, over 24677.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2763, pruned_loss=0.06954, over 4702527.98 frames. ], batch size: 65, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:10:05,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:05,839 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 10:10:05,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:10:07,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:07,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:10:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:10:09,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:10:10,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 10:10:12,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:10:12,126 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 10:10:12,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:10:12,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:17,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:17,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=331386.6666666667, ans=0.125 2023-09-29 10:10:21,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:10:21,818 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 10:10:23,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 10:10:23,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:25,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:10:25,524 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 10:10:25,586 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 10:10:28,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-09-29 10:10:28,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=331453.3333333333, ans=0.125 2023-09-29 10:10:32,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:10:35,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 10:10:35,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 10:10:46,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 10:10:48,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:49,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 10:10:49,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:50,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:51,247 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 10:10:51,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-09-29 10:10:51,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:52,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 10:10:52,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 10:10:52,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:10:57,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 10:11:00,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:11:02,239 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=15.0 2023-09-29 10:11:04,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:04,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 10:11:06,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:11:10,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 10:11:11,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 10:11:17,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:11:17,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:11:17,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:11:17,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:11:19,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:11:19,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:11:21,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:11:22,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:22,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:11:26,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=331720.0, ans=0.1 2023-09-29 10:11:27,165 INFO [train.py:1039] (0/4) Epoch 10, batch 1950, loss[loss=0.1892, simple_loss=0.2739, pruned_loss=0.05226, over 24449.00 frames. ], tot_loss[loss=0.2085, simple_loss=0.2774, pruned_loss=0.0698, over 4708589.55 frames. 
], batch size: 69, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:11:27,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:11:27,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:27,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:28,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:29,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=331720.0, ans=0.2 2023-09-29 10:11:31,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:35,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:11:36,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:36,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:11:37,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=331720.0, ans=0.125 2023-09-29 10:11:40,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. 
Duration: 0.62 2023-09-29 10:11:41,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:11:41,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:43,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:45,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:11:46,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:11:46,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:49,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:11:49,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=331786.6666666667, ans=0.0 2023-09-29 10:11:53,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:53,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:11:53,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:11:53,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:11:55,180 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=331786.6666666667, ans=0.1 2023-09-29 10:11:58,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:01,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:12:01,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:01,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:12:01,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 10:12:01,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:12:01,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:12:02,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:05,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:09,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:12:14,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 10:12:18,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:12:18,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:12:18,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 10:12:19,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:20,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=331920.0, ans=0.125 2023-09-29 10:12:24,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:12:25,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:12:27,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:31,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=331986.6666666667, ans=0.05 2023-09-29 10:12:34,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:35,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:12:37,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:38,988 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.098e+02 2.334e+02 2.724e+02 3.808e+02, threshold=4.669e+02, percent-clipped=0.0 2023-09-29 10:12:40,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:42,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:12:44,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:45,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 10:12:45,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:12:45,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:47,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 10:12:48,769 INFO [train.py:1039] (0/4) Epoch 10, batch 2000, loss[loss=0.1852, simple_loss=0.2611, pruned_loss=0.05465, over 24488.00 frames. ], tot_loss[loss=0.2096, simple_loss=0.2784, pruned_loss=0.07042, over 4698549.10 frames. ], batch size: 63, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:12:48,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 10:12:53,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:55,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:12:55,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:58,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:13:00,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:03,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 10:13:05,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:13:06,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:13:08,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 10:13:10,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:13:10,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:13:13,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 10:13:14,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 10:13:14,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:18,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 10:13:18,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:13:20,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 10:13:21,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:26,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:13:27,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:13:27,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:27,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:13:28,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:29,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 10:13:32,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 10:13:32,732 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=332186.6666666667, ans=0.0 2023-09-29 10:13:34,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:34,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:38,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=332253.3333333333, ans=0.2 2023-09-29 10:13:39,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:39,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:13:39,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:40,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:13:42,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:13:42,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:44,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:44,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:48,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:50,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 10:13:54,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:13:55,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:14:01,683 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.15 vs. 
limit=15.0 2023-09-29 10:14:04,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:07,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:07,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:09,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:14:09,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:14:10,522 INFO [train.py:1039] (0/4) Epoch 10, batch 2050, loss[loss=0.2317, simple_loss=0.2934, pruned_loss=0.08506, over 23583.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2785, pruned_loss=0.07039, over 4692080.13 frames. ], batch size: 120, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:14:12,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:12,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:15,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:15,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:21,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:14:24,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:14:25,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:25,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:14:28,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 10:14:28,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:14:29,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=332453.3333333333, ans=0.0 2023-09-29 10:14:31,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:14:31,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:14:35,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=332453.3333333333, ans=0.0 2023-09-29 10:14:35,951 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.11 vs. limit=15.0 2023-09-29 10:14:40,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 10:14:40,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:42,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 10:14:45,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:47,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 10:14:47,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:48,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=332520.0, ans=0.02 2023-09-29 10:14:50,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:51,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:14:53,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:14:53,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:55,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:14:57,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 10:14:57,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:15:00,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:01,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=332586.6666666667, ans=0.125 2023-09-29 10:15:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:15:04,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:15:06,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:10,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:16,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:15:16,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 10:15:22,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:22,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:15:24,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 10:15:26,044 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.999e+02 2.333e+02 2.741e+02 4.462e+02, threshold=4.667e+02, percent-clipped=0.0 2023-09-29 10:15:27,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 10:15:30,827 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 10:15:30,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:30,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:32,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:34,241 INFO [train.py:1039] (0/4) Epoch 10, batch 2100, loss[loss=0.2026, simple_loss=0.256, pruned_loss=0.07457, over 23456.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2765, pruned_loss=0.06982, over 4685910.83 frames. ], batch size: 285, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:15:34,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:34,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 10:15:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 10:15:37,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 10:15:39,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:15:41,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:15:41,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:42,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:15:42,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 10:15:44,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:15:45,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 10:15:45,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 10:15:49,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:15:49,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:15:49,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 10:15:49,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 10:15:55,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 10:15:55,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:58,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:59,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:16:03,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:16:03,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 10:16:03,979 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=332786.6666666667, ans=0.07 2023-09-29 10:16:05,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:05,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 10:16:06,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 10:16:06,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:09,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. 
Duration: 0.84 2023-09-29 10:16:09,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 10:16:09,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 10:16:12,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:16:14,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:16:17,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:17,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=332853.3333333333, ans=0.125 2023-09-29 10:16:20,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:22,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 10:16:24,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:24,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:16:24,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 10:16:27,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 10:16:27,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=332920.0, ans=0.125 2023-09-29 10:16:28,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 10:16:31,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=332920.0, ans=0.1 2023-09-29 10:16:33,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:16:34,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:16:36,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 10:16:42,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:45,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:16:45,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:16:45,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:16:45,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:16:46,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:16:47,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:47,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:16:48,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:16:48,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:50,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 10:16:51,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 10:16:51,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:16:54,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:54,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 10:16:54,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:16:56,149 INFO [train.py:1039] (0/4) Epoch 10, batch 2150, loss[loss=0.2216, simple_loss=0.3007, pruned_loss=0.07128, over 24320.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2751, pruned_loss=0.06958, over 4683105.60 frames. ], batch size: 74, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:16:56,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:17:03,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:17:04,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:06,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:07,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:17:07,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:07,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:17:09,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:11,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 10:17:11,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:17:15,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:15,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 10:17:17,392 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-09-29 10:17:21,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:22,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:17:24,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:24,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:24,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:25,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:17:25,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:17:25,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:17:27,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:17:28,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 10:17:30,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:17:31,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:31,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:34,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:17:35,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:17:37,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:38,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:17:40,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:40,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. 
Duration: 0.89 2023-09-29 10:17:40,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:17:40,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=333186.6666666667, ans=0.125 2023-09-29 10:17:43,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:44,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:47,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:48,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:17:50,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:50,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:50,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 10:17:52,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 10:17:53,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:17:53,797 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. 
Duration: 0.768 2023-09-29 10:17:53,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:53,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:17:55,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 10:17:55,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:17:55,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 10:17:56,949 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 10:17:56,949 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 10:17:57,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 10:17:58,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:59,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:59,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:18:01,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:18:04,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:18:04,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:04,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:06,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=333320.0, ans=0.2 2023-09-29 10:18:10,124 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.840e+02 1.996e+02 2.223e+02 3.215e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-29 10:18:13,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:18:13,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 10:18:18,070 INFO [train.py:1039] (0/4) Epoch 10, batch 2200, loss[loss=0.2295, simple_loss=0.3055, pruned_loss=0.07672, over 24083.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2759, pruned_loss=0.06951, over 4691959.22 frames. ], batch size: 80, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:18:18,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:18:21,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=333386.6666666667, ans=0.125 2023-09-29 10:18:23,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:18:25,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:18:25,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:18:26,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:18:29,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:31,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:18:31,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 10:18:37,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 10:18:40,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:18:40,893 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:18:45,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 10:18:48,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:49,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 10:18:50,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:18:51,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=333520.0, ans=0.125 2023-09-29 10:18:52,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:18:52,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 10:18:57,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:18:59,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:59,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 10:19:04,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:19:04,280 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=333520.0, ans=0.125 2023-09-29 10:19:05,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:07,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:19:08,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 10:19:11,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 10:19:11,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:11,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=333586.6666666667, ans=0.0 2023-09-29 10:19:13,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 10:19:14,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:14,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:19:16,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:18,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:19:20,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:20,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:20,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:21,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-09-29 10:19:21,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:19:23,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:19:26,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:19:26,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:19:30,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:19:30,234 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 10:19:34,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:19:34,515 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 10:19:36,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:19:36,121 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 10:19:37,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:39,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 10:19:40,607 INFO [train.py:1039] (0/4) Epoch 10, batch 2250, loss[loss=0.2769, simple_loss=0.3257, pruned_loss=0.1141, over 19479.00 frames. ], tot_loss[loss=0.2079, simple_loss=0.2765, pruned_loss=0.06965, over 4695806.84 frames. ], batch size: 388, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:19:40,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:42,266 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 10:19:43,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:19:45,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:19:52,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:19:53,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:19:58,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:19:58,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:19:58,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=333786.6666666667, ans=0.125 2023-09-29 10:19:59,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 10:20:01,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 10:20:01,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:02,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:20:06,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 10:20:07,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:20:07,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:09,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:20:12,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:13,357 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.48 vs. limit=15.0 2023-09-29 10:20:14,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:20:14,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:20:17,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. 
Duration: 0.98 2023-09-29 10:20:17,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=333853.3333333333, ans=0.125 2023-09-29 10:20:18,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:20,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:20:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:27,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:28,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:20:28,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:31,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:34,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:20:38,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:20:40,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:20:45,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 10:20:45,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:20:46,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:20:51,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:20:54,189 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.009e+02 2.238e+02 2.511e+02 3.719e+02, threshold=4.476e+02, percent-clipped=0.0 2023-09-29 10:20:54,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:20:54,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 10:20:54,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:20:55,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:20:59,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 10:21:00,474 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.01 vs. limit=15.0 2023-09-29 10:21:02,417 INFO [train.py:1039] (0/4) Epoch 10, batch 2300, loss[loss=0.2113, simple_loss=0.2786, pruned_loss=0.07203, over 23407.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2765, pruned_loss=0.06896, over 4713109.78 frames. 
], batch size: 119, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:21:02,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:21:02,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:08,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:08,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:21:12,936 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 10:21:13,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=334053.3333333333, ans=0.2 2023-09-29 10:21:15,394 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.61 vs. limit=15.0 2023-09-29 10:21:16,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:19,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=334120.0, ans=0.0 2023-09-29 10:21:25,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:21:25,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 10:21:25,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:21:26,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:26,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 10:21:28,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:21:28,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=334120.0, ans=0.125 2023-09-29 10:21:30,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=334120.0, ans=0.125 2023-09-29 10:21:31,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:31,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:21:34,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:21:36,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:21:39,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:21:46,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 10:21:46,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:50,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:21:53,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:54,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:56,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:21:56,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:21:56,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 10:22:00,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:22:00,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:01,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:02,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:22:02,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:22:04,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:22:04,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:22:05,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 10:22:05,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:22:05,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:05,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 10:22:12,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:22:15,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:22:22,023 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.48 vs. limit=6.0 2023-09-29 10:22:22,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:22,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 10:22:22,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:22:24,211 INFO [train.py:1039] (0/4) Epoch 10, batch 2350, loss[loss=0.188, simple_loss=0.2618, pruned_loss=0.05707, over 24319.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2773, pruned_loss=0.06949, over 4707128.22 frames. ], batch size: 61, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:22:24,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:22:24,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:22:24,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:22:25,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 10:22:32,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:22:32,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 10:22:38,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 10:22:41,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:44,115 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.45 vs. 
limit=10.0 2023-09-29 10:22:46,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:22:47,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:22:48,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 10:22:52,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:22:52,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=334453.3333333333, ans=0.1 2023-09-29 10:22:57,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 10:23:00,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:23:00,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=334520.0, ans=0.05 2023-09-29 10:23:01,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:23:01,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 10:23:04,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:23:06,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 10:23:06,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:23:08,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:23:08,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:08,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:23:13,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:23:15,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 10:23:15,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:23:17,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:23:17,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:23:18,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=334586.6666666667, ans=0.2 2023-09-29 10:23:20,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 10:23:22,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:23:25,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 10:23:25,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:23:30,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 10:23:35,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 10:23:36,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:36,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 10:23:36,789 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 10:23:36,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 10:23:38,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. 
Duration: 0.78 2023-09-29 10:23:39,751 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.129e+02 2.355e+02 2.700e+02 4.237e+02, threshold=4.711e+02, percent-clipped=0.0 2023-09-29 10:23:41,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:23:44,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:23:46,757 INFO [train.py:1039] (0/4) Epoch 10, batch 2400, loss[loss=0.2024, simple_loss=0.282, pruned_loss=0.06139, over 24432.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2764, pruned_loss=0.06871, over 4718897.45 frames. ], batch size: 69, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:23:49,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:23:49,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=334720.0, ans=0.1 2023-09-29 10:23:50,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:23:50,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 10:23:52,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 10:23:59,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:23:59,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:24:01,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 10:24:02,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:24:04,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:04,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 10:24:07,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=334786.6666666667, ans=0.0 2023-09-29 10:24:08,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:11,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 10:24:13,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=334786.6666666667, ans=0.125 2023-09-29 10:24:18,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:24:23,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 10:24:25,547 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:24:26,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 10:24:28,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:28,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=334853.3333333333, ans=0.1 2023-09-29 10:24:32,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=334853.3333333333, ans=0.0 2023-09-29 10:24:32,690 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=334853.3333333333, ans=0.125 2023-09-29 10:24:33,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:33,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 10:24:33,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:24:41,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:43,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:24:45,041 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=334920.0, ans=0.0 2023-09-29 10:24:46,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:24:47,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:24:47,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:24:47,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:24:49,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:49,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:24:49,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:24:52,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:24:53,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:24:54,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 10:24:56,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 10:24:57,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:57,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 10:24:57,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 10:24:57,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 10:24:57,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 10:24:57,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 10:24:59,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=334986.6666666667, ans=0.125 2023-09-29 10:25:00,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 10:25:00,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:25:03,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=334986.6666666667, ans=0.125 2023-09-29 10:25:04,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:04,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:06,435 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 10:25:06,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 10:25:07,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:25:09,563 INFO [train.py:1039] (0/4) Epoch 10, batch 2450, loss[loss=0.2095, simple_loss=0.2644, pruned_loss=0.07734, over 23711.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.275, pruned_loss=0.06867, over 4711936.57 frames. ], batch size: 232, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:25:11,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:25:11,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:15,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:15,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:17,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 10:25:22,740 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=15.0 2023-09-29 10:25:23,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:25:23,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:27,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 10:25:27,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:25:27,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:25:27,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 10:25:32,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:33,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:25:34,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:25:36,557 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=335120.0, ans=0.125 2023-09-29 10:25:39,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:25:39,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:42,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. 
Duration: 0.98 2023-09-29 10:25:43,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:25:51,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:51,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=335186.6666666667, ans=0.2 2023-09-29 10:25:52,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=335186.6666666667, ans=0.125 2023-09-29 10:25:52,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=335186.6666666667, ans=0.125 2023-09-29 10:25:53,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:53,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:25:53,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:25:53,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:54,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:25:56,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. 
Duration: 0.77 2023-09-29 10:26:01,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:26:01,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:26:05,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:05,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:10,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:26:10,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 10:26:11,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:26:11,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=335253.3333333333, ans=0.125 2023-09-29 10:26:12,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:12,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 10:26:12,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:26:15,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 10:26:20,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:26:21,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:26:21,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:26:24,602 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.022e+02 2.367e+02 2.913e+02 5.353e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 10:26:26,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 10:26:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:26:30,830 INFO [train.py:1039] (0/4) Epoch 10, batch 2500, loss[loss=0.2348, simple_loss=0.3039, pruned_loss=0.08288, over 23942.00 frames. ], tot_loss[loss=0.206, simple_loss=0.2748, pruned_loss=0.06858, over 4713363.66 frames. ], batch size: 86, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:26:33,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:43,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:26:43,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:45,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:26:45,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 10:26:52,578 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.66 vs. limit=10.0 2023-09-29 10:26:52,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:26:53,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:54,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:26:54,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:26:54,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 10:26:55,147 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.65 vs. limit=15.0 2023-09-29 10:26:56,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:57,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:59,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. 
Duration: 0.83 2023-09-29 10:26:59,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:59,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 10:27:00,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:06,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:27:07,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:27:11,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:27:11,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 10:27:13,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:15,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:19,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:25,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:29,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 10:27:32,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=335586.6666666667, ans=0.0 2023-09-29 10:27:32,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=335586.6666666667, ans=0.125 2023-09-29 10:27:33,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:27:36,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 10:27:38,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:27:38,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:27:41,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:27:41,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:27:41,455 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 10:27:41,456 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 10:27:42,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 10:27:44,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:27:46,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 10:27:46,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 10:27:48,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:48,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 10:27:52,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 10:27:53,796 INFO [train.py:1039] (0/4) Epoch 10, batch 2550, loss[loss=0.199, simple_loss=0.2699, pruned_loss=0.06407, over 23643.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2751, pruned_loss=0.0679, over 4732595.36 frames. ], batch size: 149, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:27:56,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:27:57,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:58,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:28:00,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:28:02,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. 
Duration: 0.51 2023-09-29 10:28:03,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:28:06,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 10:28:08,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:28:09,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:12,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:28:12,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 10:28:12,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:14,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:14,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:16,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:28:16,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 10:28:17,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 10:28:17,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:17,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 10:28:32,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=335853.3333333333, ans=0.125 2023-09-29 10:28:33,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:28:38,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:38,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:40,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:40,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:28:46,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:48,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:48,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 10:28:48,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:28:49,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:28:49,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:28:53,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:53,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:58,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:28:58,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 10:28:58,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:29:00,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:00,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:29:02,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:29:03,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:29:09,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:29:10,866 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.915e+02 2.103e+02 2.425e+02 3.393e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 10:29:11,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:14,184 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 10:29:17,125 INFO [train.py:1039] (0/4) Epoch 10, batch 2600, loss[loss=0.2149, simple_loss=0.2751, pruned_loss=0.07729, over 23737.00 frames. ], tot_loss[loss=0.206, simple_loss=0.2756, pruned_loss=0.0682, over 4731768.64 frames. ], batch size: 212, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:29:17,213 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 10:29:17,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:29:17,305 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 10:29:18,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 10:29:18,873 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 10:29:20,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:29:21,894 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. 
Duration: 0.768 2023-09-29 10:29:23,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 10:29:25,448 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 10:29:26,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:29:28,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 10:29:30,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 10:29:32,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:29:32,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 10:29:32,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=336120.0, ans=0.125 2023-09-29 10:29:32,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=336120.0, ans=0.1 2023-09-29 10:29:35,414 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 10:29:35,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 10:29:43,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:29:43,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:43,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:29:43,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 10:29:46,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:29:51,460 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 10:29:53,813 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-09-29 10:29:56,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:56,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:57,666 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-09-29 10:29:58,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 10:29:58,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:58,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 10:30:00,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 10:30:03,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:30:03,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:30:05,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,787 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 10:30:09,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:30:17,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:30:19,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:30:19,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 10:30:19,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:30:22,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 10:30:22,689 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=336320.0, ans=0.0 2023-09-29 10:30:23,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:26,208 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.60 vs. limit=15.0 2023-09-29 10:30:30,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 10:30:32,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:32,888 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.79 vs. limit=15.0 2023-09-29 10:30:33,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:30:38,152 INFO [train.py:1039] (0/4) Epoch 10, batch 2650, loss[loss=0.1776, simple_loss=0.2525, pruned_loss=0.05129, over 24600.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2763, pruned_loss=0.06825, over 4726226.71 frames. ], batch size: 60, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:30:40,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 10:30:40,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:40,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 10:30:40,681 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 10:30:40,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=336386.6666666667, ans=0.125 2023-09-29 10:30:42,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:30:45,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:47,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=336386.6666666667, ans=0.125 2023-09-29 10:30:48,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:30:50,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:52,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:55,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 10:30:55,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:30:55,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:30:58,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. 
Duration: 0.88 2023-09-29 10:30:58,791 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 10:31:01,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:05,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 10:31:05,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:06,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 10:31:11,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:31:11,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:16,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 10:31:16,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 10:31:22,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 10:31:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 10:31:24,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:26,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:26,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:26,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:26,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:28,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:30,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:30,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:31:31,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:31:33,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 10:31:33,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=336586.6666666667, ans=10.0 2023-09-29 10:31:34,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:36,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:31:39,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:41,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:41,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:31:44,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=336653.3333333333, ans=0.125 2023-09-29 10:31:45,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:45,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:31:45,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:46,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-09-29 10:31:46,175 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:31:50,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:52,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=336653.3333333333, ans=0.125 2023-09-29 10:31:53,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:54,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:56,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:31:57,107 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-29 10:31:57,510 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.958e+02 2.205e+02 2.606e+02 4.713e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-29 10:31:57,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:59,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:00,537 INFO [train.py:1039] (0/4) Epoch 10, batch 2700, loss[loss=0.1807, simple_loss=0.2563, pruned_loss=0.05256, over 24526.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2773, pruned_loss=0.06916, over 4708042.79 frames. 
], batch size: 60, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:32:02,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:02,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 10:32:04,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:06,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:32:07,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:32:07,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=336720.0, ans=0.125 2023-09-29 10:32:09,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:09,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:10,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:32:10,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:32:12,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 10:32:12,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:32:12,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 10:32:12,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:32:14,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:32:16,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:32:16,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:32:20,013 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.11 vs. limit=15.0 2023-09-29 10:32:20,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:32:20,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 10:32:22,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:32:25,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:32:27,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 10:32:32,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=336853.3333333333, ans=0.125 2023-09-29 10:32:34,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:32:34,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:34,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:32:34,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:32:37,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:32:42,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:32:42,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:32:42,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:32:49,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:49,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 10:32:50,104 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.86 vs. limit=15.0 2023-09-29 10:32:55,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=336920.0, ans=0.125 2023-09-29 10:32:57,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.15 vs. limit=15.0 2023-09-29 10:32:58,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:58,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:00,825 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-09-29 10:33:01,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:33:01,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:04,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=336920.0, ans=0.1 2023-09-29 10:33:05,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:06,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:33:06,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:33:09,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:12,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:12,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:13,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:33:13,883 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:33:15,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:15,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:20,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 10:33:21,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:23,837 INFO [train.py:1039] (0/4) Epoch 10, batch 2750, loss[loss=0.2027, simple_loss=0.2745, pruned_loss=0.06551, over 23161.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2771, pruned_loss=0.06908, over 4712030.52 frames. 
], batch size: 105, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:33:23,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:33:23,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 10:33:25,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 10:33:25,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:27,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=337053.3333333333, ans=0.1 2023-09-29 10:33:28,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:28,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:32,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:32,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:33:33,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:36,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:33:36,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:33:38,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:33:38,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:38,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 10:33:38,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:33:38,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:40,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=337120.0, ans=0.0 2023-09-29 10:33:40,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=337120.0, ans=0.2 2023-09-29 10:33:42,863 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.50 vs. limit=15.0 2023-09-29 10:33:43,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 10:33:46,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:48,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:33:48,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:48,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:33:49,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:51,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:33:51,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:53,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:57,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:33:57,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:33:57,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:33:58,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:58,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 10:33:59,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=337186.6666666667, ans=0.125 2023-09-29 10:34:06,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:34:08,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:34:08,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:13,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:34:13,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:34:15,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:34:15,231 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:34:21,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:34:22,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:34:22,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 10:34:26,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:34:30,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 10:34:33,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=337320.0, ans=0.09899494936611666 2023-09-29 10:34:34,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:34:37,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:34:37,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 10:34:38,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:34:41,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:34:41,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 10:34:41,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:34:42,599 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.063e+02 2.307e+02 2.543e+02 4.120e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 10:34:46,147 INFO [train.py:1039] (0/4) Epoch 10, batch 2800, loss[loss=0.2085, simple_loss=0.2643, pruned_loss=0.07634, over 23774.00 frames. ], tot_loss[loss=0.2066, simple_loss=0.2762, pruned_loss=0.06852, over 4718914.63 frames. 
], batch size: 164, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:34:46,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:34:46,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:34:48,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:34:48,277 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=337386.6666666667, ans=0.0 2023-09-29 10:34:50,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 10:34:50,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:50,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:51,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:51,911 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 10:34:51,912 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 10:34:56,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:59,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 10:34:59,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:35:03,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:35:06,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 10:35:08,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:35:08,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 10:35:10,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:11,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:35:11,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:16,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:16,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:16,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:35:17,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 10:35:26,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:35:28,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:35:29,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:31,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:35:32,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:39,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:39,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 10:35:40,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:40,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:40,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:35:46,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:47,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:35:51,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:53,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:35:53,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:53,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:35:54,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:35:54,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:35:56,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:56,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 10:35:56,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:35:58,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:58,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:00,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. 
Duration: 0.86 2023-09-29 10:36:02,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:02,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:36:03,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:36:05,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 10:36:10,395 INFO [train.py:1039] (0/4) Epoch 10, batch 2850, loss[loss=0.195, simple_loss=0.2628, pruned_loss=0.06361, over 23218.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2754, pruned_loss=0.06795, over 4724497.42 frames. ], batch size: 105, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:36:12,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:36:12,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:36:13,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:36:15,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:19,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:20,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:36:20,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:36:23,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:23,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:36:25,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:36:25,961 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.65 vs. limit=6.0 2023-09-29 10:36:26,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 10:36:31,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 10:36:31,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:34,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 10:36:35,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:38,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 10:36:38,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-09-29 10:36:40,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:55,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:56,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:36:56,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:56,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:36:56,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:36:58,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:37:00,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:37:00,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 10:37:01,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:37:01,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:03,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:37:03,557 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=337920.0, ans=0.2 2023-09-29 10:37:04,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:06,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:06,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:09,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:11,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:37:14,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:37:16,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:16,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:18,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:37:24,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:37:25,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. 
Duration: 0.83 2023-09-29 10:37:25,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 10:37:27,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:37:27,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=337986.6666666667, ans=0.2 2023-09-29 10:37:28,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:28,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 10:37:28,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:37:30,091 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.039e+02 2.270e+02 2.590e+02 3.840e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-29 10:37:30,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:31,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:32,261 INFO [train.py:1039] (0/4) Epoch 10, batch 2900, loss[loss=0.2256, simple_loss=0.2884, pruned_loss=0.08143, over 23621.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2758, pruned_loss=0.06739, over 4731848.44 frames. ], batch size: 149, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:37:32,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 10:37:32,331 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 10:37:32,388 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 10:37:32,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:32,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:37,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:37:37,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:37,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:38,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 10:37:39,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=338053.3333333333, ans=0.035 2023-09-29 10:37:39,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=338053.3333333333, ans=0.125 2023-09-29 10:37:43,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:43,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. 
Duration: 0.63 2023-09-29 10:37:45,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 10:37:47,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:37:47,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:37:49,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:51,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:55,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:57,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:37:58,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 10:37:58,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:38:00,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:38:00,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=338120.0, ans=10.0 2023-09-29 10:38:04,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 10:38:05,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 10:38:05,871 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:38:07,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:38:07,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 10:38:07,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:38:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:38:08,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:38:11,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:38:12,782 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.29 vs. limit=15.0 2023-09-29 10:38:13,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:38:15,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:38:18,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:20,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 10:38:20,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 10:38:20,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:38:25,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:38:28,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 10:38:28,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:38:28,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=338253.3333333333, ans=0.125 2023-09-29 10:38:28,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=338253.3333333333, ans=0.125 2023-09-29 10:38:34,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:38:34,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=338253.3333333333, ans=0.0 2023-09-29 10:38:38,846 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-09-29 10:38:44,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:38:44,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:38:45,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 10:38:49,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:49,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 10:38:49,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:50,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:38:55,099 INFO [train.py:1039] (0/4) Epoch 10, batch 2950, loss[loss=0.2308, simple_loss=0.2922, pruned_loss=0.08467, over 23691.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2768, pruned_loss=0.06853, over 4720442.37 frames. ], batch size: 232, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:38:55,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 10:38:58,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 10:38:58,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:38:58,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:01,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:02,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:39:03,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=338386.6666666667, ans=0.125 2023-09-29 10:39:04,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 10:39:04,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 10:39:04,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:39:04,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:39:13,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:14,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 10:39:16,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:39:16,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:19,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:39:19,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:39:20,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:22,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:22,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=338453.3333333333, ans=0.125 2023-09-29 10:39:24,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:39:25,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 10:39:31,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 10:39:31,323 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 10:39:32,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 10:39:34,423 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 10:39:35,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 10:39:35,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:37,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:37,418 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 10:39:37,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:39:40,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 10:39:40,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=338520.0, ans=0.0 2023-09-29 10:39:41,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:42,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:39:46,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:47,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 10:39:47,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:49,069 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 10:39:50,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:50,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 10:39:56,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:58,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:39:59,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 10:39:59,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:40:03,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 10:40:07,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:07,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=338653.3333333333, ans=0.04949747468305833 2023-09-29 10:40:08,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 10:40:08,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:40:10,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:40:10,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:40:13,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:40:13,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:13,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:40:13,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:40:15,012 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.095e+02 2.360e+02 2.809e+02 3.974e+02, threshold=4.720e+02, percent-clipped=0.0 2023-09-29 10:40:15,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:16,513 INFO [train.py:1039] (0/4) Epoch 10, batch 3000, loss[loss=0.2147, simple_loss=0.286, pruned_loss=0.07168, over 23297.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2769, pruned_loss=0.06865, over 4728272.59 frames. 
], batch size: 105, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:40:16,514 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 10:40:31,294 INFO [train.py:1071] (0/4) Epoch 10, validation: loss=0.2858, simple_loss=0.2843, pruned_loss=0.1436, over 1125622.00 frames. 2023-09-29 10:40:31,295 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 10:40:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:40:33,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:33,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 10:40:34,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:38,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:40:38,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:40:43,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 10:40:43,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 10:40:46,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:40:46,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 10:40:47,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 10:40:49,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:40:51,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=338786.6666666667, ans=0.2 2023-09-29 10:40:55,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:40:58,384 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.86 vs. limit=15.0 2023-09-29 10:41:06,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:41:14,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 10:41:16,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:41:17,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=338853.3333333333, ans=0.125 2023-09-29 10:41:19,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:41:19,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:41:19,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 10:41:21,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:21,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 10:41:22,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 10:41:24,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:41:24,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:41:28,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:41:28,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:28,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:28,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:41:35,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:41:35,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:35,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 10:41:36,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:39,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 10:41:40,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:41:40,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:41:40,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:41:41,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=338986.6666666667, ans=0.125 2023-09-29 10:41:45,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:45,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:47,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:41:47,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 10:41:47,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:41:49,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-09-29 10:41:49,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:41:52,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 10:41:54,031 INFO [train.py:1039] (0/4) Epoch 10, batch 3050, loss[loss=0.189, simple_loss=0.2661, pruned_loss=0.05595, over 24628.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2771, pruned_loss=0.06877, over 4722690.21 frames. ], batch size: 60, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:41:54,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:41:54,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:41:54,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 10:41:55,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 10:41:55,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:41:57,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:41:58,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:58,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 10:41:58,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:00,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:42:01,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 10:42:05,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:07,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:07,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:42:10,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:15,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 10:42:15,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=339120.0, ans=0.125 2023-09-29 10:42:20,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=339120.0, ans=0.0 2023-09-29 10:42:20,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=339120.0, ans=0.0 2023-09-29 10:42:22,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. 
Duration: 0.87 2023-09-29 10:42:22,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 10:42:22,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:25,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=339186.6666666667, ans=0.125 2023-09-29 10:42:27,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:42:27,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=339186.6666666667, ans=0.0 2023-09-29 10:42:29,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=339186.6666666667, ans=0.125 2023-09-29 10:42:30,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:30,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:31,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:34,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:42:34,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 10:42:34,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:36,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:36,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:37,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:38,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:40,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:41,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 10:42:43,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:43,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:42:46,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:46,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:42:48,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:42:48,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:42:53,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:55,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:02,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:02,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:03,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:05,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:43:05,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:43:05,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=339320.0, ans=0.125 2023-09-29 10:43:07,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. 
Duration: 0.67 2023-09-29 10:43:07,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-29 10:43:09,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:09,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:10,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 10:43:12,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:15,093 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.990e+02 2.334e+02 2.687e+02 4.208e+02, threshold=4.668e+02, percent-clipped=0.0 2023-09-29 10:43:16,562 INFO [train.py:1039] (0/4) Epoch 10, batch 3100, loss[loss=0.1885, simple_loss=0.2711, pruned_loss=0.05293, over 24676.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2764, pruned_loss=0.06816, over 4733741.47 frames. ], batch size: 73, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:43:18,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:19,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:43:21,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:43:25,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. 
Duration: 0.89 2023-09-29 10:43:28,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 10:43:28,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 10:43:28,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=339386.6666666667, ans=0.1 2023-09-29 10:43:29,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:43:34,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:43:34,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:36,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:43:38,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=339453.3333333333, ans=0.0 2023-09-29 10:43:41,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:47,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 10:43:50,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:43:50,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:43:52,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:43:52,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:53,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:43:55,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:43:55,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 10:43:55,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:43:56,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:58,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 10:44:00,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:04,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:44:04,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=339586.6666666667, ans=0.0 2023-09-29 10:44:05,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. 
Duration: 0.89 2023-09-29 10:44:07,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 10:44:08,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:08,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:44:10,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:10,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:10,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:44:12,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:44:12,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:44:14,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:44:14,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=339586.6666666667, ans=0.0 2023-09-29 10:44:15,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:44:15,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:44:15,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 10:44:18,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:44:20,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 10:44:23,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:44:23,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 10:44:25,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:25,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:25,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 10:44:25,905 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.54 vs. limit=15.0 2023-09-29 10:44:37,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 10:44:38,110 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.86 vs. 
limit=15.0 2023-09-29 10:44:38,895 INFO [train.py:1039] (0/4) Epoch 10, batch 3150, loss[loss=0.1946, simple_loss=0.2432, pruned_loss=0.07303, over 22650.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2748, pruned_loss=0.0678, over 4734676.91 frames. ], batch size: 322, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:44:41,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:41,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:44:43,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:44:45,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 10:44:45,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:47,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:44:48,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 10:44:48,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:51,716 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. 
Duration: 0.831 2023-09-29 10:44:54,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 10:44:56,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:56,345 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 10:44:57,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:44:59,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 10:44:59,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 10:44:59,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 10:44:59,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:59,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:00,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:45:02,555 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=339786.6666666667, ans=0.2 2023-09-29 10:45:04,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. 
Duration: 0.87 2023-09-29 10:45:06,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:06,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:07,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:07,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=339786.6666666667, ans=0.125 2023-09-29 10:45:09,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:45:13,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 10:45:14,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:45:16,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:45:16,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:17,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 10:45:22,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 10:45:22,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 10:45:23,640 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.65 vs. limit=15.0 2023-09-29 10:45:24,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:45:24,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:45:24,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:24,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:45:24,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=339853.3333333333, ans=0.0 2023-09-29 10:45:25,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:45:25,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:45:27,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 10:45:27,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:45:27,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:45:28,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:45:30,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:31,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 10:45:31,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:33,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 10:45:34,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:34,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 10:45:37,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 10:45:37,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=339920.0, ans=0.2 2023-09-29 10:45:38,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:45:40,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:40,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. 
Duration: 0.91 2023-09-29 10:45:42,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:45:43,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:47,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:45:49,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:49,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:45:55,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:45:55,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:57,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:45:58,678 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.980e+02 2.239e+02 2.536e+02 3.813e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 10:46:00,290 INFO [train.py:1039] (0/4) Epoch 10, batch 3200, loss[loss=0.2028, simple_loss=0.2777, pruned_loss=0.06391, over 24131.00 frames. ], tot_loss[loss=0.204, simple_loss=0.2742, pruned_loss=0.0669, over 4751287.84 frames. 
], batch size: 80, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:46:00,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=340053.3333333333, ans=0.0 2023-09-29 10:46:04,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:46:04,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:46:08,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:09,418 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.16 vs. limit=15.0 2023-09-29 10:46:10,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:46:10,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 10:46:11,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:46:17,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:46:21,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:31,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 10:46:42,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 10:46:42,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:46:44,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 10:46:45,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:46:49,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:46:49,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:46:51,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:46:56,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 10:46:58,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:46:59,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 10:47:01,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 10:47:04,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 10:47:09,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:10,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:47:10,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:11,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=340320.0, ans=0.0 2023-09-29 10:47:12,460 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 10:47:12,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:47:15,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:17,527 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:47:18,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 10:47:18,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 10:47:19,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=340320.0, ans=0.0 2023-09-29 10:47:20,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. 
Duration: 0.55 2023-09-29 10:47:21,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 10:47:23,683 INFO [train.py:1039] (0/4) Epoch 10, batch 3250, loss[loss=0.2037, simple_loss=0.2864, pruned_loss=0.06054, over 24484.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.274, pruned_loss=0.06738, over 4752209.91 frames. ], batch size: 69, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:47:23,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:47:29,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:47:29,638 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 10:47:29,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:47:29,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:32,564 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 10:47:37,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:47:39,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=340453.3333333333, ans=0.0 2023-09-29 10:47:40,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:47:47,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:47:47,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 10:47:48,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:48,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:48,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:47:50,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:47:50,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:47:53,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:54,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:47:56,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:47:56,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:47:56,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:47:59,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:00,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:48:03,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:03,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:48:04,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:04,721 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.70 vs. limit=6.0 2023-09-29 10:48:06,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:48:06,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:11,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 10:48:11,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:48:11,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:48:12,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:14,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:48:14,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=340586.6666666667, ans=0.0 2023-09-29 10:48:17,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=340586.6666666667, ans=0.125 2023-09-29 10:48:19,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:48:28,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:29,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:29,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 10:48:29,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:48:29,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-29 10:48:29,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:31,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 10:48:31,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 10:48:33,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:34,255 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.94 vs. limit=15.0 2023-09-29 10:48:34,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:36,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:36,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:48:37,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:42,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:48:42,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:48:44,372 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 2.128e+02 2.401e+02 2.998e+02 4.766e+02, threshold=4.802e+02, percent-clipped=1.0 2023-09-29 10:48:44,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 10:48:44,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:48:46,490 INFO [train.py:1039] (0/4) Epoch 10, batch 3300, loss[loss=0.2053, simple_loss=0.2693, pruned_loss=0.07069, over 23473.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2745, pruned_loss=0.06771, over 4745475.21 frames. ], batch size: 134, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:48:46,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:48:46,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 10:48:49,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:51,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 10:48:52,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 10:48:52,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 10:48:53,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:48:56,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:57,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:48:57,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:59,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:49:00,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:49:03,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:05,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:09,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=340786.6666666667, ans=0.0 2023-09-29 10:49:10,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 10:49:12,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:12,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:14,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:49:15,718 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 10:49:17,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:49:17,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:49:18,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:49:19,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:49:20,907 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 10:49:22,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:22,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:49:24,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:24,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 10:49:25,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 10:49:25,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:49:27,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:49:30,405 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 10:49:33,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 10:49:33,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:49:35,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 10:49:38,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:49:39,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:49:41,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:49:45,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:45,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:45,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:45,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 10:49:48,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:49:48,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:49,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:49:50,631 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 10:49:52,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 10:49:55,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:49:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:57,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:49:58,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:58,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:00,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:50:00,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:50:00,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:50:00,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:03,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:50:05,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 10:50:05,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:06,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:08,236 INFO [train.py:1039] (0/4) Epoch 10, batch 3350, loss[loss=0.2141, simple_loss=0.2787, pruned_loss=0.07475, over 23719.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2762, pruned_loss=0.06908, over 4735864.83 frames. ], batch size: 232, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:50:08,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:50:08,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:50:09,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:11,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 10:50:11,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:15,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:50:17,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:19,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:50:23,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:23,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:50:26,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:27,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:50:28,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 10:50:30,642 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 10:50:30,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:50:32,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=341120.0, ans=0.0 2023-09-29 10:50:35,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 10:50:35,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 10:50:35,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:50:35,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:50:38,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:50:38,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 10:50:38,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:38,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:50:41,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:42,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:44,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:50:45,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:50:46,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=341186.6666666667, ans=10.0 2023-09-29 10:50:49,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:52,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:52,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:56,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:56,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:00,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:00,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:00,256 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=341253.3333333333, ans=0.1 2023-09-29 10:51:01,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 10:51:05,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 10:51:05,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:51:05,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 10:51:05,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:51:06,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 10:51:08,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:09,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:13,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=341320.0, ans=0.05 2023-09-29 10:51:16,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:17,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 10:51:18,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:20,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 10:51:22,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:51:28,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:30,308 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.041e+02 2.250e+02 2.635e+02 4.628e+02, threshold=4.499e+02, percent-clipped=0.0 2023-09-29 10:51:30,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 10:51:31,846 INFO [train.py:1039] (0/4) Epoch 10, batch 3400, loss[loss=0.2192, simple_loss=0.2817, pruned_loss=0.07835, over 23343.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2769, pruned_loss=0.06924, over 4728820.46 frames. ], batch size: 119, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:51:31,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:51:32,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:51:33,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:35,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 10:51:37,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:37,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-09-29 10:51:38,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:39,543 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.93 vs. limit=6.0 2023-09-29 10:51:40,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:40,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:51:41,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:51:41,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 10:51:43,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 10:51:43,880 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 10:51:45,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:50,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:50,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:51,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 10:51:52,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:51:56,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=341453.3333333333, ans=0.125 2023-09-29 10:51:59,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:51:59,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 10:52:04,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:52:08,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:09,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:11,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:52:16,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:52:18,504 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=341520.0, ans=0.0 2023-09-29 10:52:19,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 10:52:24,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:52:25,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 10:52:25,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:52:27,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:27,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:52:28,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:52:32,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:35,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:52:35,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:52:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:52:44,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. 
Duration: 0.95 2023-09-29 10:52:46,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=341653.3333333333, ans=0.2 2023-09-29 10:52:51,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:52:53,945 INFO [train.py:1039] (0/4) Epoch 10, batch 3450, loss[loss=0.2139, simple_loss=0.2933, pruned_loss=0.06719, over 24531.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2764, pruned_loss=0.06912, over 4724508.10 frames. ], batch size: 71, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:52:55,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 10:53:00,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 10:53:00,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:01,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:53:01,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 10:53:03,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:53:03,765 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.89 vs. limit=15.0 2023-09-29 10:53:03,819 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.79 vs. 
limit=15.0 2023-09-29 10:53:06,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:53:08,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=341786.6666666667, ans=0.125 2023-09-29 10:53:10,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:53:10,914 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-09-29 10:53:12,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:13,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:53:14,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:14,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=341786.6666666667, ans=0.95 2023-09-29 10:53:16,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:17,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=341786.6666666667, ans=0.125 2023-09-29 10:53:24,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. 
Duration: 0.7 2023-09-29 10:53:27,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=341853.3333333333, ans=0.125 2023-09-29 10:53:28,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 10:53:28,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:53:28,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:53:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:30,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=341853.3333333333, ans=0.0 2023-09-29 10:53:35,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=341853.3333333333, ans=0.2 2023-09-29 10:53:36,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 10:53:37,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:53:41,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:53:41,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:43,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 10:53:45,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:53:46,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 10:53:46,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:53:50,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:52,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:53:55,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 10:53:59,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:54:04,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:54:05,047 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:54:06,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:09,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:12,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:54:12,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:54:12,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:54:12,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:54:16,391 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 2.010e+02 2.253e+02 2.520e+02 3.608e+02, threshold=4.507e+02, percent-clipped=0.0 2023-09-29 10:54:16,435 INFO [train.py:1039] (0/4) Epoch 10, batch 3500, loss[loss=0.184, simple_loss=0.2606, pruned_loss=0.05375, over 21670.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2752, pruned_loss=0.0686, over 4721735.54 frames. ], batch size: 47, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:54:18,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:21,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:54:24,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 10:54:25,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:54:28,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. 
Duration: 0.588875 2023-09-29 10:54:29,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=342053.3333333333, ans=0.125 2023-09-29 10:54:31,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:31,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 10:54:35,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:54:36,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:54:38,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:54:38,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:54:38,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:54:39,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:39,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:39,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. 
Duration: 0.86 2023-09-29 10:54:42,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:44,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:54:46,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:50,298 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.85 vs. limit=15.0 2023-09-29 10:54:51,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:51,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 10:54:51,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:54,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:55,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:54:58,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:59,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 10:54:59,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:01,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 10:55:01,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 10:55:02,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 10:55:04,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:05,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:05,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:07,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:55:10,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:55:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:55:14,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=342253.3333333333, ans=0.1 2023-09-29 10:55:15,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 10:55:16,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 10:55:16,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 10:55:16,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:55:20,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:20,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:22,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:25,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 10:55:26,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:27,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:29,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 10:55:30,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. 
Duration: 0.45 2023-09-29 10:55:30,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=342320.0, ans=0.125 2023-09-29 10:55:32,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=342320.0, ans=0.125 2023-09-29 10:55:33,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:35,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:35,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:35,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:38,070 INFO [train.py:1039] (0/4) Epoch 10, batch 3550, loss[loss=0.1769, simple_loss=0.2441, pruned_loss=0.05483, over 13765.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.273, pruned_loss=0.0683, over 4688021.05 frames. ], batch size: 29, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:55:39,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:55:44,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:45,614 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.19 vs. 
limit=6.0 2023-09-29 10:55:47,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:55:50,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:52,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:55:55,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:55,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:55:56,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:55:57,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=342453.3333333333, ans=0.125 2023-09-29 10:56:00,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:00,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:56:00,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:01,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:56:02,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 10:56:08,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:56:08,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:10,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:10,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:11,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:56:11,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 10:56:11,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:13,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:14,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:56:19,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:19,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:56:21,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 10:56:24,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 10:56:24,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:56:26,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 10:56:26,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:30,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:56:30,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:56:33,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 10:56:35,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:39,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=342586.6666666667, ans=0.2 2023-09-29 10:56:40,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:42,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 10:56:42,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 10:56:47,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:48,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 10:56:49,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=342653.3333333333, ans=0.125 2023-09-29 10:56:55,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 10:56:55,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:56:55,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:56:57,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:57:02,291 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.051e+02 2.303e+02 2.669e+02 4.347e+02, threshold=4.606e+02, percent-clipped=0.0 2023-09-29 10:57:02,333 INFO [train.py:1039] (0/4) Epoch 10, batch 3600, loss[loss=0.1924, simple_loss=0.2682, pruned_loss=0.0583, over 24497.00 frames. ], tot_loss[loss=0.2049, simple_loss=0.2734, pruned_loss=0.06815, over 4703228.19 frames. 
], batch size: 63, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:57:03,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:05,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:07,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:57:07,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:57:09,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:09,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 10:57:14,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:57:15,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:17,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=342786.6666666667, ans=0.125 2023-09-29 10:57:20,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:22,538 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.67 vs. 
limit=15.0 2023-09-29 10:57:23,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:24,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:57:24,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:26,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 10:57:26,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:28,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:29,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:57:31,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:32,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=342786.6666666667, ans=0.0 2023-09-29 10:57:33,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:33,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:57:36,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-09-29 10:57:36,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=342853.3333333333, ans=0.2 2023-09-29 10:57:43,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:57:45,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:57:45,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 10:57:49,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:57:53,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:56,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:57,404 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.45 vs. limit=15.0 2023-09-29 10:58:01,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:58:01,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:58:01,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 10:58:05,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-09-29 10:58:05,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 10:58:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:58:08,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:58:09,112 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=342986.6666666667, ans=0.125 2023-09-29 10:58:09,598 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-09-29 10:58:10,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 10:58:10,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=342986.6666666667, ans=0.125 2023-09-29 10:58:11,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:11,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:58:11,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:13,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 10:58:14,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. 
Duration: 0.45 2023-09-29 10:58:18,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:18,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 10:58:25,478 INFO [train.py:1039] (0/4) Epoch 10, batch 3650, loss[loss=0.2328, simple_loss=0.2871, pruned_loss=0.08926, over 22819.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2754, pruned_loss=0.06915, over 4699599.17 frames. ], batch size: 322, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:58:25,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 10:58:25,949 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=343053.3333333333, ans=0.2 2023-09-29 10:58:27,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:58:30,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 10:58:32,775 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.60 vs. limit=15.0 2023-09-29 10:58:33,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 10:58:36,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:58:36,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 10:58:36,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:58:41,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:58:41,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:42,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 10:58:44,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:58:45,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:45,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 10:58:46,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:58:48,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:58:48,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:58:49,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:58:52,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. 
Duration: 0.77 2023-09-29 10:58:53,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=343120.0, ans=0.07 2023-09-29 10:58:54,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 10:58:54,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:58:56,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 10:58:59,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:58:59,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:59:02,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:59:04,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:04,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:59:06,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:59:06,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:59:07,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 10:59:12,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:14,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:14,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:59:14,433 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=343253.3333333333, ans=0.125 2023-09-29 10:59:17,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:59:19,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:19,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:25,669 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 10:59:29,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:29,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:31,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:59:31,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 10:59:33,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:59:33,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=343320.0, ans=0.0 2023-09-29 10:59:35,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:37,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 10:59:37,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:40,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:59:43,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:43,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:59:44,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=343320.0, ans=0.0 2023-09-29 10:59:47,274 INFO [train.py:1039] (0/4) Epoch 10, batch 3700, loss[loss=0.1713, simple_loss=0.2483, pruned_loss=0.04721, over 24598.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2756, pruned_loss=0.06876, over 4713815.04 frames. ], batch size: 60, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:59:47,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 10:59:47,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 10:59:47,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:48,842 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.980e+02 2.218e+02 2.465e+02 4.377e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 10:59:48,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:59:49,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:59:53,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:59:57,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:58,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:59:58,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:59:58,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:00:00,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 11:00:02,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:04,359 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 11:00:04,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=343453.3333333333, ans=0.125 2023-09-29 11:00:11,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=343453.3333333333, ans=0.07 2023-09-29 11:00:13,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:00:15,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:00:17,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:00:17,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 11:00:17,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:20,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:20,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-09-29 11:00:20,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=343520.0, ans=0.1 2023-09-29 11:00:22,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:23,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:00:24,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=343520.0, ans=0.09899494936611666 2023-09-29 11:00:25,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:25,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:00:28,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:00:32,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:32,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 11:00:34,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:34,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 11:00:41,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 11:00:41,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:00:44,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:44,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 11:00:45,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:00:45,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:00:46,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=343586.6666666667, ans=0.0 2023-09-29 11:00:46,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=343586.6666666667, ans=0.125 2023-09-29 11:00:47,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:47,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:47,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=343586.6666666667, ans=0.0 2023-09-29 11:00:50,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 11:00:52,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 11:00:53,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 11:00:55,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:00:55,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:00:57,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:00:58,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:01:01,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:01:03,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:01:04,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:07,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 11:01:08,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:01:10,105 INFO [train.py:1039] (0/4) Epoch 10, batch 3750, loss[loss=0.209, simple_loss=0.2733, pruned_loss=0.07234, over 23488.00 frames. 
], tot_loss[loss=0.2086, simple_loss=0.2777, pruned_loss=0.06973, over 4699931.53 frames. ], batch size: 134, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:01:11,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:01:11,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 11:01:13,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:01:15,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:01:17,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343720.0, ans=0.1 2023-09-29 11:01:20,848 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=343720.0, ans=0.5 2023-09-29 11:01:22,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:26,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:01:28,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 11:01:28,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343786.6666666667, ans=0.1 2023-09-29 11:01:30,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:01:31,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:33,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 11:01:33,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:35,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:35,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:38,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 11:01:39,401 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.01 vs. limit=22.5 2023-09-29 11:01:43,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 11:01:45,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:01:45,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:47,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:52,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:55,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:01:58,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 11:02:02,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:08,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:02:08,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:02:11,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:02:16,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:02:18,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:02:21,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 11:02:23,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:02:24,141 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-09-29 11:02:25,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:02:33,598 INFO [train.py:1039] (0/4) Epoch 10, batch 3800, loss[loss=0.1958, simple_loss=0.2504, pruned_loss=0.07061, over 23589.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2773, pruned_loss=0.06957, over 4703092.80 frames. ], batch size: 256, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:02:33,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=344053.3333333333, ans=0.125 2023-09-29 11:02:35,057 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.030e+02 2.447e+02 3.016e+02 6.033e+02, threshold=4.894e+02, percent-clipped=2.0 2023-09-29 11:02:35,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:02:38,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:38,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:02:40,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 11:02:40,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:02:43,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:02:44,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:02:45,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:02:45,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:46,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:02:48,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:48,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:02:50,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:02:50,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 11:02:53,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:02:53,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:02:58,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 11:03:02,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:03:03,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:03:05,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:03:05,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:05,792 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=344186.6666666667, ans=0.125 2023-09-29 11:03:08,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:08,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=344186.6666666667, ans=0.125 2023-09-29 11:03:11,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:11,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=344186.6666666667, ans=0.125 2023-09-29 11:03:16,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:03:16,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 11:03:17,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:03:21,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=344253.3333333333, ans=10.0 2023-09-29 11:03:23,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:23,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=344253.3333333333, ans=0.1 2023-09-29 11:03:29,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:03:33,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 11:03:34,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 11:03:36,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:37,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:39,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:41,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 11:03:41,503 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=344320.0, ans=0.125 2023-09-29 11:03:44,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. 
Duration: 0.99 2023-09-29 11:03:44,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 11:03:44,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:45,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:47,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=344320.0, ans=0.2 2023-09-29 11:03:50,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:03:52,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:03:55,614 INFO [train.py:1039] (0/4) Epoch 10, batch 3850, loss[loss=0.173, simple_loss=0.2462, pruned_loss=0.04992, over 17306.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2755, pruned_loss=0.06805, over 4714167.25 frames. ], batch size: 37, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:03:58,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:04:00,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 11:04:02,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:04:03,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 11:04:07,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:04:10,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:12,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:04:12,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 11:04:17,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:19,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:21,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:21,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:04:25,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:25,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:04:25,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:25,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 11:04:28,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:31,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:04:32,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=344520.0, ans=0.125 2023-09-29 11:04:33,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 11:04:33,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 11:04:35,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:35,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:40,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-09-29 11:04:43,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 11:04:44,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:47,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 11:04:49,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:04:54,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:55,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:59,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:01,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 11:05:04,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 11:05:07,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:07,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:11,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 11:05:11,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:05:12,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:05:14,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 11:05:14,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=344653.3333333333, ans=0.125 2023-09-29 11:05:15,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:05:17,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 11:05:17,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:19,079 INFO [train.py:1039] (0/4) Epoch 10, batch 3900, loss[loss=0.2226, simple_loss=0.2975, pruned_loss=0.07387, over 24418.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2741, pruned_loss=0.06732, over 4710455.73 frames. ], batch size: 77, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:05:19,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 11:05:19,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:05:20,642 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.918e+02 2.148e+02 2.537e+02 4.144e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-29 11:05:20,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:22,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:05:22,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:22,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:22,640 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:05:23,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:23,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 11:05:23,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:29,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:29,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 11:05:30,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:05:32,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:32,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=344720.0, ans=0.1 2023-09-29 11:05:34,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:34,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:35,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:05:36,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 11:05:36,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:05:39,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 11:05:39,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:39,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 11:05:42,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. 
Duration: 0.78 2023-09-29 11:05:45,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=344786.6666666667, ans=0.1 2023-09-29 11:05:46,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:48,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:48,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:05:48,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:05:52,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:54,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=344853.3333333333, ans=0.125 2023-09-29 11:05:55,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:05:56,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:05:56,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:05:58,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:06:03,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:03,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:06:07,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=344853.3333333333, ans=0.0 2023-09-29 11:06:12,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:06:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:06:24,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:06:28,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:30,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 11:06:30,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 11:06:30,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:30,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=344986.6666666667, ans=0.1 2023-09-29 11:06:33,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. 
Duration: 0.84 2023-09-29 11:06:34,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:06:34,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 11:06:42,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:42,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 11:06:43,928 INFO [train.py:1039] (0/4) Epoch 10, batch 3950, loss[loss=0.2125, simple_loss=0.2921, pruned_loss=0.06644, over 24421.00 frames. ], tot_loss[loss=0.2043, simple_loss=0.2741, pruned_loss=0.06728, over 4709683.49 frames. ], batch size: 77, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:06:44,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:06:47,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:06:48,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:06:58,406 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 11:06:58,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:06:58,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. 
Duration: 0.72 2023-09-29 11:07:00,502 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 11:07:00,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:01,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=12.29 vs. limit=15.0 2023-09-29 11:07:03,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:03,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:07:03,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:07,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 11:07:08,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:07:10,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:07:10,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:07:10,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:07:10,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 11:07:22,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:07:22,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:07:29,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 11:07:34,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 11:07:34,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 11:07:36,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:07:37,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:07:42,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=345253.3333333333, ans=0.1 2023-09-29 11:07:47,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:07:47,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:07:47,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:47,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 11:07:47,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 11:07:56,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:07:57,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:07:58,229 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.63 vs. limit=15.0 2023-09-29 11:07:59,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 11:07:59,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=345320.0, ans=0.0 2023-09-29 11:07:59,680 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=345320.0, ans=10.0 2023-09-29 11:08:07,688 INFO [train.py:1039] (0/4) Epoch 10, batch 4000, loss[loss=0.1932, simple_loss=0.267, pruned_loss=0.0597, over 23323.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.275, pruned_loss=0.06776, over 4720349.36 frames. ], batch size: 93, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 11:08:09,113 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.046e+02 2.407e+02 2.777e+02 6.014e+02, threshold=4.814e+02, percent-clipped=1.0 2023-09-29 11:08:10,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:17,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:08:22,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:23,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:08:23,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:23,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 11:08:25,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:08:26,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 11:08:26,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:08:26,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 11:08:31,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:31,446 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:08:34,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:08:34,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:08:34,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:08:34,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:34,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:08:34,752 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.42 vs. limit=10.0 2023-09-29 11:08:37,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:08:39,353 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 11:08:40,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:08:41,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:08:41,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=345520.0, ans=0.1 2023-09-29 11:08:43,977 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 11:08:44,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:08:44,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 11:08:44,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=345520.0, ans=0.0 2023-09-29 11:08:44,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=345520.0, ans=0.04949747468305833 2023-09-29 11:08:50,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=345520.0, ans=0.1 2023-09-29 11:08:52,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 11:08:53,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:55,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:08:55,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=345586.6666666667, ans=0.125 2023-09-29 11:08:56,859 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 11:08:58,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:08:58,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 11:08:58,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:00,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 11:09:01,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:09:03,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:09:03,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:09:05,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:09:08,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 11:09:08,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:10,451 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 11:09:16,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:09:18,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:09:20,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:09:21,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:23,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 11:09:24,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:25,688 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.19 vs. limit=6.0 2023-09-29 11:09:29,485 INFO [train.py:1039] (0/4) Epoch 10, batch 4050, loss[loss=0.2063, simple_loss=0.2835, pruned_loss=0.06455, over 24462.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2746, pruned_loss=0.06728, over 4736109.17 frames. ], batch size: 66, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:09:29,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:32,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:09:34,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 11:09:36,112 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=345720.0, ans=0.2 2023-09-29 11:09:36,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=345720.0, ans=0.125 2023-09-29 11:09:37,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:09:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:09:38,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 11:09:38,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:09:40,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:44,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:49,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:09:49,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:09:50,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:09:50,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:55,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:57,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:10:00,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 11:10:01,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=345853.3333333333, ans=0.0 2023-09-29 11:10:02,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. 
Duration: 0.73 2023-09-29 11:10:03,760 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 11:10:05,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:10:11,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 11:10:11,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:16,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:20,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:10:22,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:10:22,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:25,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:10:28,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 11:10:28,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:10:29,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:10:32,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 11:10:36,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:44,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 11:10:45,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:45,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:10:47,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 11:10:47,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 11:10:47,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:10:50,229 INFO [train.py:1039] (0/4) Epoch 10, batch 4100, loss[loss=0.2166, simple_loss=0.2743, pruned_loss=0.07944, over 23785.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.275, pruned_loss=0.06793, over 4727058.46 frames. ], batch size: 149, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:10:50,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 11:10:52,483 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.996e+02 2.168e+02 2.450e+02 3.987e+02, threshold=4.335e+02, percent-clipped=0.0 2023-09-29 11:10:52,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:10:52,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:11:01,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 11:11:04,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 11:11:05,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 11:11:08,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 11:11:08,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:09,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:11:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. 
Duration: 0.989 2023-09-29 11:11:14,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:14,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:11:15,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:17,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:11:20,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:11:21,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:23,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:11:23,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 11:11:23,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:23,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:11:23,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:23,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 11:11:24,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 11:11:25,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=346186.6666666667, ans=0.125 2023-09-29 11:11:26,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:28,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 11:11:30,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:11:32,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:32,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 11:11:35,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:11:35,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:11:37,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:11:38,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 11:11:40,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 11:11:40,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:11:44,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 11:11:44,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:45,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:11:48,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:53,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:11:56,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:11:58,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:12:06,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:06,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:12:09,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:12:12,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 11:12:13,814 INFO [train.py:1039] (0/4) Epoch 10, batch 4150, loss[loss=0.2098, simple_loss=0.2661, pruned_loss=0.07672, over 23590.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2752, pruned_loss=0.06809, over 4731575.65 frames. ], batch size: 256, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:12:17,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:12:17,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=346386.6666666667, ans=0.1 2023-09-29 11:12:19,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:12:19,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:12:19,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:23,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 11:12:23,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:23,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 11:12:23,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 11:12:25,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. 
Duration: 0.89 2023-09-29 11:12:26,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:31,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:12:31,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:36,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:12:38,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:12:39,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:12:40,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:12:40,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:41,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=346453.3333333333, ans=0.1 2023-09-29 11:12:42,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:12:46,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:12:52,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:12:52,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 11:12:56,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 11:12:56,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:12:56,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 11:12:56,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:12:56,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:01,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:01,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:04,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 11:13:06,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:08,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 11:13:08,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=346586.6666666667, ans=0.125 2023-09-29 11:13:10,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 11:13:11,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:13:13,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 11:13:15,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:13:18,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:18,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:19,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 11:13:19,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:19,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:13:21,680 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-52000.pt 2023-09-29 11:13:25,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 11:13:28,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 11:13:29,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:29,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:13:29,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:13:30,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 11:13:31,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:31,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:13:31,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:13:33,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:33,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 11:13:35,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:39,465 INFO [train.py:1039] (0/4) Epoch 10, batch 4200, loss[loss=0.2174, simple_loss=0.2742, pruned_loss=0.08034, over 23795.00 frames. 
], tot_loss[loss=0.2051, simple_loss=0.2745, pruned_loss=0.06788, over 4728810.54 frames. ], batch size: 164, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:13:39,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:13:41,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 11:13:42,742 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.147e+02 2.478e+02 3.007e+02 3.865e+02, threshold=4.955e+02, percent-clipped=0.0 2023-09-29 11:13:42,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:13:45,427 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=346720.0, ans=0.1 2023-09-29 11:13:46,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:13:48,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:13:49,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:49,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:51,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 11:13:52,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.81 vs. 
limit=5.0 2023-09-29 11:13:53,848 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.98 vs. limit=15.0 2023-09-29 11:13:54,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 11:13:54,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:56,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:59,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:14:03,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:14:06,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:06,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:07,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 11:14:07,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:14:07,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=346786.6666666667, ans=0.2 2023-09-29 11:14:09,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:14:10,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:14:10,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:14:10,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=346853.3333333333, ans=0.025 2023-09-29 11:14:12,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:14:14,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff3.min_abs, batch_count=346853.3333333333, ans=0.2 2023-09-29 11:14:15,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 11:14:15,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:20,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:14:20,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:14:22,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:14:23,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:14:25,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:14:26,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 11:14:27,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:27,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:14:32,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:14:35,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:42,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:14:43,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=346986.6666666667, ans=0.07 2023-09-29 11:14:44,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 11:14:46,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=346986.6666666667, ans=0.125 2023-09-29 11:14:47,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:14:48,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=346986.6666666667, ans=0.125 2023-09-29 11:14:53,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:14:54,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:14:55,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 11:15:00,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=347053.3333333333, ans=0.125 2023-09-29 11:15:00,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=347053.3333333333, ans=0.1 2023-09-29 11:15:01,585 INFO [train.py:1039] (0/4) Epoch 10, batch 4250, loss[loss=0.1739, simple_loss=0.2416, pruned_loss=0.05308, over 24294.00 frames. ], tot_loss[loss=0.2042, simple_loss=0.2732, pruned_loss=0.06757, over 4724756.98 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:15:03,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:15:03,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=347053.3333333333, ans=0.125 2023-09-29 11:15:06,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:15:07,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 11:15:09,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:14,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:15:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 11:15:14,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:15:17,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:22,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:26,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:28,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:28,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=347120.0, ans=0.125 2023-09-29 11:15:29,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:15:29,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:15:31,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 11:15:33,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:35,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:38,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:15:38,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=347186.6666666667, ans=0.125 2023-09-29 11:15:39,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:41,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 11:15:43,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 11:15:43,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:45,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:45,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:46,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:15:46,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:15:48,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:50,679 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.84 vs. limit=12.0 2023-09-29 11:15:51,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:15:51,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:15:56,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:15:58,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:59,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 11:15:59,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:16:01,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 11:16:03,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:16:05,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:16:06,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:16:06,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:16:10,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 11:16:11,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:16:11,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:16:14,176 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.31 vs. limit=22.5 2023-09-29 11:16:15,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:17,514 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.84 vs. limit=15.0 2023-09-29 11:16:18,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:21,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:16:22,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:16:24,807 INFO [train.py:1039] (0/4) Epoch 10, batch 4300, loss[loss=0.214, simple_loss=0.2891, pruned_loss=0.06946, over 24030.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2733, pruned_loss=0.06794, over 4705993.40 frames. 
], batch size: 86, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:16:24,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:26,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:16:27,854 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.974e+02 2.213e+02 2.611e+02 3.799e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 11:16:27,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:16:27,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 11:16:29,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:32,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:34,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:16:38,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:45,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:45,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. 
Duration: 0.56 2023-09-29 11:16:48,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:16:49,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:16:49,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:16:49,892 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 11:16:50,753 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.05 vs. limit=10.0 2023-09-29 11:16:54,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:16:56,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:16:59,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 11:16:59,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:16:59,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 11:17:02,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:17:04,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 11:17:07,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:17:07,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:17:09,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:17:11,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:11,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:17:13,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 11:17:13,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 11:17:15,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:17:18,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:17:18,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 11:17:18,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 11:17:18,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 11:17:20,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 11:17:20,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:20,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 11:17:21,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 11:17:25,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:27,078 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 11:17:28,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:17:28,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:28,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:31,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-09-29 11:17:33,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:17:33,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:33,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:17:33,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:33,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:17:35,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:17:36,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=347653.3333333333, ans=0.0 2023-09-29 11:17:38,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:40,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:40,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 11:17:40,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=347653.3333333333, ans=0.125 2023-09-29 11:17:42,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=347653.3333333333, ans=0.125 2023-09-29 11:17:47,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 11:17:48,397 INFO [train.py:1039] (0/4) Epoch 10, batch 4350, loss[loss=0.1841, simple_loss=0.2549, pruned_loss=0.0567, over 24327.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2741, pruned_loss=0.06797, over 4706950.74 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:17:48,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:17:51,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=347720.0, ans=0.0 2023-09-29 11:17:53,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:57,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:59,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:17:59,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:18:05,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 11:18:08,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:18:08,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=347786.6666666667, ans=0.125 2023-09-29 11:18:10,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=347786.6666666667, ans=0.0 2023-09-29 11:18:11,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:18:11,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:16,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:18:19,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:18:21,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:18:22,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=347853.3333333333, ans=0.125 2023-09-29 11:18:27,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 11:18:28,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 11:18:28,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:35,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:36,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 11:18:39,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:41,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:18:42,155 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.96 vs. limit=12.0 2023-09-29 11:18:45,970 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 11:18:46,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:47,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:18:47,657 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 11:18:49,117 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 11:18:49,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:18:50,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:18:50,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:52,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:52,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 11:18:56,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:56,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:56,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:56,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 11:18:58,681 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 11:18:59,919 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-09-29 11:18:59,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 11:19:01,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=347986.6666666667, ans=0.0 2023-09-29 11:19:03,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:19:04,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:19:04,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:06,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:19:06,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=347986.6666666667, ans=0.1 2023-09-29 11:19:06,947 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.34 vs. limit=22.5 2023-09-29 11:19:08,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 11:19:09,553 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 11:19:09,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:10,997 INFO [train.py:1039] (0/4) Epoch 10, batch 4400, loss[loss=0.1834, simple_loss=0.2583, pruned_loss=0.05419, over 24474.00 frames. 
], tot_loss[loss=0.2063, simple_loss=0.2749, pruned_loss=0.06882, over 4695323.11 frames. ], batch size: 58, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:19:14,005 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.070e+02 2.289e+02 2.714e+02 5.548e+02, threshold=4.577e+02, percent-clipped=2.0 2023-09-29 11:19:14,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:14,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:17,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:19:18,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 11:19:18,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 11:19:20,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 11:19:20,289 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 11:19:20,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=348053.3333333333, ans=0.125 2023-09-29 11:19:21,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:19:21,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:19:24,053 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.23 vs. limit=22.5 2023-09-29 11:19:24,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 11:19:26,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:28,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:28,846 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 11:19:30,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:30,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 11:19:30,602 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 11:19:35,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 11:19:36,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 11:19:37,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 11:19:37,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 11:19:38,017 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. limit=6.0 2023-09-29 11:19:38,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:38,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:40,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:19:42,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 11:19:42,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 11:19:43,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:46,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:19:46,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:48,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:48,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:19:48,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 11:19:48,515 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 11:19:49,107 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.00 vs. limit=10.0 2023-09-29 11:19:51,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:58,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:20:01,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 11:20:04,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:20:06,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:10,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:20:10,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 11:20:10,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:20:10,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 11:20:10,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:20:11,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:20:15,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=348320.0, ans=0.0 2023-09-29 11:20:15,655 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.29 vs. limit=15.0 2023-09-29 11:20:16,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 11:20:17,155 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=348320.0, ans=0.1 2023-09-29 11:20:18,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 11:20:19,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 11:20:19,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:19,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 11:20:21,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:20:25,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 11:20:28,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 11:20:32,542 INFO [train.py:1039] (0/4) Epoch 10, batch 4450, loss[loss=0.2084, simple_loss=0.2867, pruned_loss=0.06503, over 24372.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2768, pruned_loss=0.06965, over 4684389.90 frames. ], batch size: 77, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:20:32,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:35,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:35,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=348386.6666666667, ans=0.0 2023-09-29 11:20:36,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:20:44,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:20:44,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:20:49,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:52,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:20:55,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 11:20:55,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:56,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 11:20:57,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:20:59,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:59,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:20:59,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:21:02,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:21:06,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:06,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:09,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:21:09,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:21:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:21:15,189 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.70 vs. limit=15.0 2023-09-29 11:21:15,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:21:17,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 11:21:17,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 11:21:17,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:21:20,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=348586.6666666667, ans=0.1 2023-09-29 11:21:21,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=348586.6666666667, ans=0.1 2023-09-29 11:21:22,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:23,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 11:21:26,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:21:32,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 11:21:33,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 11:21:33,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:33,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:33,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:21:33,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:35,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:37,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=348653.3333333333, ans=0.125 2023-09-29 11:21:38,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:21:39,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 11:21:41,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:21:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:21:43,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 11:21:47,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:47,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:21:48,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:21:51,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 11:21:53,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:21:54,826 INFO [train.py:1039] (0/4) Epoch 10, batch 4500, loss[loss=0.2059, simple_loss=0.2572, pruned_loss=0.07724, over 22624.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2759, pruned_loss=0.06892, over 4694438.94 frames. ], batch size: 322, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:21:57,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=348720.0, ans=0.1 2023-09-29 11:21:58,627 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.956e+02 2.459e+02 2.945e+02 4.663e+02, threshold=4.917e+02, percent-clipped=1.0 2023-09-29 11:21:58,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:00,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 11:22:00,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-09-29 11:22:01,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:03,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=348720.0, ans=0.1 2023-09-29 11:22:08,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:22:08,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:09,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:22:10,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:22:10,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:11,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:17,112 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=348786.6666666667, ans=0.125 2023-09-29 11:22:19,259 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=15.0 2023-09-29 11:22:22,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:22:23,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:22:24,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=348786.6666666667, ans=0.2 2023-09-29 11:22:25,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:27,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:22:28,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:22:35,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:22:40,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:22:45,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:22:49,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:22:50,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 11:22:52,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:22:52,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 11:22:53,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:55,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:56,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:56,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 11:22:56,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:22:56,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:03,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:23:03,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:23:07,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:10,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:23:10,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 11:23:11,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=348986.6666666667, ans=0.2 2023-09-29 11:23:12,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 11:23:12,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 11:23:12,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 11:23:17,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 11:23:18,492 INFO [train.py:1039] (0/4) Epoch 10, batch 4550, loss[loss=0.2059, simple_loss=0.2448, pruned_loss=0.08354, over 19441.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2748, pruned_loss=0.06846, over 4680846.89 frames. ], batch size: 389, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:23:20,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 11:23:20,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:23,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:25,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:23:27,241 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=349053.3333333333, ans=0.125 2023-09-29 11:23:28,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:32,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:23:35,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:23:38,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:23:38,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:23:38,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:41,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:43,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:45,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:23:49,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. 
Duration: 0.85 2023-09-29 11:23:50,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 11:23:52,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:23:53,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 11:23:56,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 11:23:59,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:01,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=349186.6666666667, ans=0.125 2023-09-29 11:24:01,415 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.23 vs. limit=15.0 2023-09-29 11:24:02,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 11:24:05,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:24:08,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 11:24:10,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 11:24:13,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:15,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:15,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:16,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:16,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 11:24:17,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 11:24:19,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:24:19,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 11:24:20,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 11:24:22,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:23,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:24:23,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:25,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:25,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:24:27,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:24:28,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 11:24:30,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:30,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:24:32,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 11:24:32,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:24:32,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 11:24:35,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:24:35,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 11:24:37,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:24:37,698 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.02 vs. limit=15.0 2023-09-29 11:24:38,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:38,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:24:41,253 INFO [train.py:1039] (0/4) Epoch 10, batch 4600, loss[loss=0.1828, simple_loss=0.218, pruned_loss=0.07379, over 19438.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2733, pruned_loss=0.0679, over 4685552.11 frames. ], batch size: 389, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:24:41,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:24:42,258 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-09-29 11:24:44,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:24:46,325 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.915e+02 2.143e+02 2.405e+02 4.065e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-29 11:24:48,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:24:48,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:51,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:24:51,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:24:52,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:24:53,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 11:24:55,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:24:59,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:24:59,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:01,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:09,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 11:25:10,230 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.69 vs. limit=22.5 2023-09-29 11:25:12,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:25:15,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:17,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:25:17,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:24,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 11:25:24,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:25:24,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:25:29,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:29,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:25:31,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:25:32,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=349586.6666666667, ans=0.1 2023-09-29 11:25:37,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 11:25:38,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 11:25:42,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:45,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:25:45,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=349653.3333333333, ans=0.0 2023-09-29 11:25:48,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:48,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 11:25:48,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:48,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 11:25:50,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:50,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:50,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=349653.3333333333, ans=0.2 2023-09-29 11:25:51,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:25:51,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:53,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:53,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 11:25:55,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 11:25:55,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 11:25:55,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:56,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:25:58,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:59,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:26:03,469 INFO [train.py:1039] (0/4) Epoch 10, batch 4650, loss[loss=0.1876, simple_loss=0.2775, pruned_loss=0.04879, over 24306.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2732, pruned_loss=0.0675, over 4696730.36 frames. 
], batch size: 74, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:26:03,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=349720.0, ans=0.1 2023-09-29 11:26:09,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:26:12,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:12,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:12,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:26:14,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:26:14,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:14,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:16,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=349720.0, ans=0.0 2023-09-29 11:26:17,260 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.25 vs. limit=15.0 2023-09-29 11:26:18,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-09-29 11:26:21,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:26:22,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 11:26:24,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:24,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 11:26:24,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:26:25,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 11:26:25,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 11:26:25,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:26:31,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:26:33,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:33,625 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-09-29 11:26:34,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=349786.6666666667, ans=15.0 2023-09-29 11:26:35,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:36,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 11:26:39,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:40,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:26:41,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 11:26:42,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:26:45,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:26:52,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:55,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:55,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=349920.0, ans=0.0 2023-09-29 11:26:58,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:27:00,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:02,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:27:02,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 11:27:03,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 11:27:03,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 11:27:03,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 11:27:06,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:12,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:27:13,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:14,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 11:27:14,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:16,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 11:27:16,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:27:17,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:27:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:27:20,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:22,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:25,769 INFO [train.py:1039] (0/4) Epoch 10, batch 4700, loss[loss=0.1812, simple_loss=0.2497, pruned_loss=0.05641, over 24307.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2737, pruned_loss=0.06765, over 4695286.93 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:27:25,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:26,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:27:26,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:27:27,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 11:27:27,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 11:27:29,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 11:27:30,693 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.991e+02 2.178e+02 2.659e+02 4.780e+02, threshold=4.356e+02, percent-clipped=1.0 2023-09-29 11:27:39,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:40,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:40,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:27:43,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:44,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:27:49,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 11:27:49,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 11:27:52,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:52,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=350120.0, ans=0.0 2023-09-29 11:27:54,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 11:27:54,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:58,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:03,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=350186.6666666667, ans=0.125 2023-09-29 11:28:06,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:28:08,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:28:10,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:28:17,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 11:28:18,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:28:20,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:24,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 11:28:27,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:28:32,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 11:28:33,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 11:28:33,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:33,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:36,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:36,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:28:36,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 11:28:36,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=350320.0, ans=0.0 2023-09-29 11:28:38,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 11:28:39,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:39,713 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.65 vs. limit=22.5 2023-09-29 11:28:40,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 11:28:40,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 11:28:42,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:43,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=350320.0, ans=0.1 2023-09-29 11:28:47,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 11:28:49,339 INFO [train.py:1039] (0/4) Epoch 10, batch 4750, loss[loss=0.288, simple_loss=0.3256, pruned_loss=0.1253, over 19703.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2744, pruned_loss=0.06777, over 4702203.50 frames. ], batch size: 388, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:28:49,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=350386.6666666667, ans=0.0 2023-09-29 11:28:51,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:28:52,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:28:59,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-09-29 11:28:59,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:01,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=350386.6666666667, ans=0.0 2023-09-29 11:29:03,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 11:29:05,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:29:05,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:29:05,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:12,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 11:29:18,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:29:19,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 11:29:19,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:22,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:22,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:29:24,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:25,129 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 11:29:25,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 11:29:30,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 11:29:34,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:36,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:29:39,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:29:39,116 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 11:29:39,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:29:42,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:29:45,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:29:47,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-09-29 11:29:47,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 11:29:47,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:48,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:29:48,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:50,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:29:50,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 11:29:55,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 11:29:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:01,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:30:01,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 11:30:02,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:04,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:30:04,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:30:06,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:06,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:30:09,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:09,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 11:30:09,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 11:30:11,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 11:30:12,439 INFO [train.py:1039] (0/4) Epoch 10, batch 4800, loss[loss=0.2324, simple_loss=0.2922, pruned_loss=0.08626, over 22830.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.275, pruned_loss=0.0683, over 4702539.14 frames. ], batch size: 323, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:30:15,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:30:15,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 11:30:16,722 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.911e+02 2.173e+02 2.490e+02 3.366e+02, threshold=4.345e+02, percent-clipped=0.0 2023-09-29 11:30:16,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 11:30:22,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:23,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:28,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:30:28,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:30,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:30,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 11:30:31,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:31,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:30:35,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:30:39,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:30:40,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:42,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:30:43,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:43,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:30:44,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:45,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:47,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=350853.3333333333, ans=0.125 2023-09-29 11:30:49,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:51,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:53,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:53,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 11:30:55,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:30:56,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:58,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 11:30:58,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 11:30:58,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=350853.3333333333, ans=0.0 2023-09-29 11:30:59,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:59,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:30:59,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:30:59,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:02,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:31:05,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:31:05,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:31:09,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:11,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:13,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:18,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 11:31:20,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:21,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:21,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:31:23,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:26,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:28,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:31:28,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:28,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 11:31:29,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:31:30,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:31:33,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:33,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:33,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:34,448 INFO [train.py:1039] (0/4) Epoch 10, batch 4850, loss[loss=0.1941, simple_loss=0.2763, pruned_loss=0.05593, over 23297.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2757, pruned_loss=0.06865, over 4696624.62 frames. ], batch size: 105, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:31:34,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 11:31:37,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 11:31:37,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:31:37,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:41,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:43,196 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=351053.3333333333, ans=0.125 2023-09-29 11:31:45,114 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.39 vs. limit=15.0 2023-09-29 11:31:48,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=351053.3333333333, ans=0.05 2023-09-29 11:31:51,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 11:31:51,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:56,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:31:57,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:31:57,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:31:58,077 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:32:01,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:32:01,427 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=351120.0, ans=0.1 2023-09-29 11:32:02,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:32:04,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:32:04,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 11:32:07,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:32:10,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:32:11,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:32:11,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:32:11,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 11:32:14,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 11:32:15,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 11:32:21,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 11:32:23,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:32:29,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:32:30,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 11:32:32,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:32:32,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:32:34,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:32:36,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 11:32:36,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:32:36,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 11:32:37,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:37,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:32:39,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 11:32:48,399 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=12.0 2023-09-29 11:32:49,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:54,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:32:54,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:32:57,375 INFO [train.py:1039] (0/4) Epoch 10, batch 4900, loss[loss=0.1903, simple_loss=0.2435, pruned_loss=0.06858, over 22649.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2746, pruned_loss=0.06805, over 4706141.89 frames. ], batch size: 322, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:33:00,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 11:33:00,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 11:33:02,130 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 2.045e+02 2.293e+02 2.550e+02 3.770e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 11:33:06,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:08,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:09,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:33:12,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 11:33:17,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 11:33:22,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 11:33:22,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=351453.3333333333, ans=0.1 2023-09-29 11:33:23,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 11:33:23,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:23,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:25,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:33:25,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:25,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:33:25,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 11:33:30,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 11:33:30,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:33:32,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:33:34,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:35,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:33:35,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:37,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:37,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 11:33:38,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 11:33:42,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:42,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 11:33:42,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 11:33:45,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 11:33:47,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:33:47,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=351586.6666666667, ans=0.0 2023-09-29 11:33:49,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=351586.6666666667, ans=0.125 2023-09-29 11:33:50,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:33:50,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:33:52,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:52,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:33:52,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 11:33:52,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 11:33:55,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:56,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:33:58,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:34:01,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 11:34:03,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:34:03,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 11:34:04,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 11:34:11,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:13,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:14,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. 
Duration: 0.57 2023-09-29 11:34:14,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=351653.3333333333, ans=0.0 2023-09-29 11:34:15,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:15,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:34:17,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:20,042 INFO [train.py:1039] (0/4) Epoch 10, batch 4950, loss[loss=0.177, simple_loss=0.2582, pruned_loss=0.04791, over 24559.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2726, pruned_loss=0.06712, over 4710997.86 frames. ], batch size: 60, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:34:20,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:20,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:34:20,376 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=351720.0, ans=0.125 2023-09-29 11:34:22,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:22,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:34:23,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 11:34:25,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:25,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:28,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 11:34:30,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 11:34:30,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:34:31,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 11:34:31,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:31,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:34:31,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:34:33,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:34,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:36,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 11:34:37,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:34:37,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:39,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:41,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:44,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:34:48,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:49,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:52,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:52,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:54,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:34:56,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 11:34:57,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. 
Duration: 0.85 2023-09-29 11:34:59,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:00,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:35:00,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:35:02,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=351853.3333333333, ans=0.2 2023-09-29 11:35:03,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:03,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:35:05,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:35:08,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:10,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:35:11,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:35:13,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:13,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:35:15,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 11:35:15,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:35:17,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:35:21,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:35:23,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:35:23,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:35:23,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:24,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:35:26,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:35:28,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:35:28,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:35:30,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:35:31,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 11:35:31,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=351986.6666666667, ans=0.125 2023-09-29 11:35:36,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:35:41,004 INFO [train.py:1039] (0/4) Epoch 10, batch 5000, loss[loss=0.2003, simple_loss=0.2603, pruned_loss=0.07016, over 23612.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2718, pruned_loss=0.06632, over 4711852.41 frames. ], batch size: 256, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:35:41,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 11:35:41,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:35:45,997 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.45 vs. limit=15.0 2023-09-29 11:35:46,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:47,803 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.017e+02 2.302e+02 2.737e+02 4.823e+02, threshold=4.603e+02, percent-clipped=1.0 2023-09-29 11:35:48,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 11:35:49,321 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.22 vs. limit=22.5 2023-09-29 11:35:49,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 11:35:51,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 11:35:51,839 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=352053.3333333333, ans=0.2 2023-09-29 11:35:53,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:35:54,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 11:35:54,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:54,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:35:56,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 11:35:57,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:59,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:01,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. 
Duration: 0.61 2023-09-29 11:36:01,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:01,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:02,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 11:36:03,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 11:36:03,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:36:03,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 11:36:03,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:36:03,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:04,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:36:04,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 11:36:04,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 11:36:06,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. 
Duration: 0.54 2023-09-29 11:36:06,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:07,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:09,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 11:36:09,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:36:11,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:12,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:12,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:36:14,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 11:36:15,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:36:15,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:36:21,594 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 11:36:24,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 11:36:26,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:26,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:32,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 11:36:32,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:32,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:32,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:36:34,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 11:36:35,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:37,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:38,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 11:36:39,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=352253.3333333333, ans=0.125 2023-09-29 11:36:39,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=352253.3333333333, ans=0.0 2023-09-29 11:36:44,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 11:36:49,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:58,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:00,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:00,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:37:00,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:01,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:37:01,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:37:03,147 INFO [train.py:1039] (0/4) Epoch 10, batch 5050, loss[loss=0.2121, simple_loss=0.2891, pruned_loss=0.06757, over 24561.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2727, pruned_loss=0.06652, over 4712056.65 frames. 
], batch size: 71, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:37:03,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:08,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:08,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 11:37:08,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=352386.6666666667, ans=0.125 2023-09-29 11:37:10,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:37:12,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=352386.6666666667, ans=0.2 2023-09-29 11:37:13,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:16,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:37:16,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 11:37:17,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:17,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:37:20,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:37:20,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=352453.3333333333, ans=0.1 2023-09-29 11:37:21,441 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.69 vs. limit=22.5 2023-09-29 11:37:22,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:37:22,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:37:24,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=352453.3333333333, ans=0.1 2023-09-29 11:37:34,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 11:37:36,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:37:36,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:37:36,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 11:37:36,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 11:37:38,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=352520.0, ans=0.125 2023-09-29 11:37:39,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:39,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:41,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:37:41,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 11:37:42,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 11:37:44,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:46,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:37:49,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:51,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 11:37:52,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:37:55,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. 
Duration: 0.99 2023-09-29 11:37:57,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:37:57,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:37:58,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:59,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:38:00,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:02,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:38:03,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:03,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:38:03,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:38:05,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 11:38:07,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:38:09,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 11:38:12,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:38:12,620 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 11:38:12,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:38:14,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:14,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:14,871 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 11:38:16,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=352653.3333333333, ans=0.2 2023-09-29 11:38:17,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:17,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 11:38:17,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:21,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:23,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:38:23,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 11:38:23,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=352653.3333333333, ans=0.125 2023-09-29 11:38:24,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 11:38:26,298 INFO [train.py:1039] (0/4) Epoch 10, batch 5100, loss[loss=0.2314, simple_loss=0.2844, pruned_loss=0.08922, over 23806.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2737, pruned_loss=0.0665, over 4731338.95 frames. ], batch size: 212, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:38:26,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:27,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:38:28,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:38:31,038 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 11:38:32,347 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.939e+02 2.293e+02 2.682e+02 4.893e+02, threshold=4.586e+02, percent-clipped=1.0 2023-09-29 11:38:33,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:37,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-09-29 11:38:39,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 11:38:41,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:42,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:44,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:44,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 11:38:44,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 11:38:50,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:50,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:38:53,686 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.65 vs. limit=15.0 2023-09-29 11:38:57,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:57,783 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=352853.3333333333, ans=0.2 2023-09-29 11:38:59,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-09-29 11:38:59,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:01,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:39:02,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:39:05,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 11:39:07,639 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 11:39:09,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:09,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 11:39:09,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 11:39:15,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:21,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:39:22,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 11:39:23,004 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 11:39:23,018 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 11:39:24,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 11:39:24,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:29,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 11:39:33,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 11:39:37,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:39:39,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:39:41,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 11:39:44,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:39:44,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. 
Duration: 0.86 2023-09-29 11:39:49,542 INFO [train.py:1039] (0/4) Epoch 10, batch 5150, loss[loss=0.2086, simple_loss=0.2811, pruned_loss=0.06806, over 23226.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2746, pruned_loss=0.06715, over 4720892.24 frames. ], batch size: 93, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:39:49,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:39:50,500 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.86 vs. limit=10.0 2023-09-29 11:39:51,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:39:51,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:39:51,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:39:51,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:39:51,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:39:52,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 11:39:52,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 11:39:54,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. 
Duration: 0.74 2023-09-29 11:39:54,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:39:54,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 11:39:55,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:55,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:39:57,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:39:58,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:04,155 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=353120.0, ans=0.125 2023-09-29 11:40:06,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:40:06,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 11:40:07,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:07,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:40:07,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 11:40:07,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:08,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=353120.0, ans=0.0 2023-09-29 11:40:09,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:09,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:40:09,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:40:10,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 11:40:11,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:40:12,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:12,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:40:15,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 11:40:16,169 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.60 vs. limit=15.0 2023-09-29 11:40:17,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 11:40:24,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:40:24,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 11:40:26,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=353186.6666666667, ans=0.1 2023-09-29 11:40:28,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:34,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:34,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=353186.6666666667, ans=0.125 2023-09-29 11:40:35,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:39,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:41,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:44,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 11:40:49,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:51,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 11:40:51,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:54,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:55,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:57,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 11:41:00,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:02,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:41:04,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:41:04,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:41:05,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:41:05,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:41:05,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:41:05,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:41:06,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=353320.0, ans=0.1 2023-09-29 11:41:09,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:41:10,325 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.60 vs. limit=22.5 2023-09-29 11:41:11,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:41:13,168 INFO [train.py:1039] (0/4) Epoch 10, batch 5200, loss[loss=0.2169, simple_loss=0.2755, pruned_loss=0.07914, over 23848.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2753, pruned_loss=0.06769, over 4723072.59 frames. ], batch size: 195, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:41:14,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:19,181 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.032e+02 2.395e+02 2.917e+02 4.034e+02, threshold=4.790e+02, percent-clipped=0.0 2023-09-29 11:41:19,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 11:41:19,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:41:20,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:24,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 11:41:24,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=353386.6666666667, ans=0.5 2023-09-29 11:41:26,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:41:26,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:27,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 11:41:27,915 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=353453.3333333333, ans=0.125 2023-09-29 11:41:30,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:41:32,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:35,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 11:41:38,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:41:38,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=353453.3333333333, ans=0.1 2023-09-29 11:41:39,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:41:41,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. 
Duration: 0.55 2023-09-29 11:41:41,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 11:41:44,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 11:41:44,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=353520.0, ans=0.125 2023-09-29 11:41:46,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:46,386 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 11:41:46,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:47,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:48,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:41:48,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 11:41:49,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:41:53,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:56,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. 
Duration: 0.8 2023-09-29 11:41:56,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 11:41:56,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 11:42:03,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 11:42:04,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:42:09,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:42:09,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:10,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 11:42:11,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:42:12,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 11:42:12,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:12,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:15,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 11:42:17,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:42:20,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:42:22,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:22,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:25,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=353653.3333333333, ans=0.125 2023-09-29 11:42:28,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:30,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 11:42:32,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:32,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:42:32,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=353720.0, ans=0.0 2023-09-29 11:42:34,453 INFO [train.py:1039] (0/4) Epoch 10, batch 5250, loss[loss=0.2089, simple_loss=0.2704, pruned_loss=0.07375, over 23739.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.2744, pruned_loss=0.06762, over 4711201.55 frames. 
], batch size: 149, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:42:34,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:36,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:42:36,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:42:37,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=353720.0, ans=0.125 2023-09-29 11:42:40,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:42:40,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=353720.0, ans=0.125 2023-09-29 11:42:43,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:45,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:42:45,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:42:50,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:42:50,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=353786.6666666667, ans=0.125 2023-09-29 11:42:51,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:42:56,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:42:58,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:58,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 11:42:58,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:58,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=353786.6666666667, ans=0.1 2023-09-29 11:42:59,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:43:23,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=353920.0, ans=0.125 2023-09-29 11:43:38,295 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.93 vs. limit=15.0 2023-09-29 11:43:48,668 INFO [train.py:1039] (0/4) Epoch 10, batch 5300, loss[loss=0.21, simple_loss=0.2811, pruned_loss=0.0694, over 23416.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2741, pruned_loss=0.06756, over 4731945.50 frames. 
], batch size: 93, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:43:53,275 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=354053.3333333333, ans=0.0 2023-09-29 11:43:54,367 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.989e+02 2.153e+02 2.436e+02 4.114e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 11:44:01,447 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=354120.0, ans=0.1 2023-09-29 11:44:04,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=354120.0, ans=0.125 2023-09-29 11:44:05,139 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-10.pt 2023-09-29 11:44:10,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:44:10,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 11:44:10,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 11:44:10,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:11,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:11,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:11,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:44:11,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:11,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:11,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:11,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:44:11,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:44:11,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 11:44:12,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 11:44:12,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 11:44:12,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:44:12,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 11:44:12,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 11:44:12,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:44:13,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:13,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:13,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:13,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:44:14,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:14,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:14,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:14,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:14,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:14,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:44:14,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:44:14,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:44:15,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 11:44:15,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:16,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:16,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 11:44:16,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 11:44:16,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:44:16,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 11:44:16,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 11:44:16,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:17,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 11:44:18,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:18,267 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 11:44:18,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 11:44:18,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:44:18,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:18,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 11:44:18,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 11:44:18,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 11:44:19,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:22,141 INFO [train.py:1039] (0/4) Epoch 11, batch 0, loss[loss=0.1966, simple_loss=0.2803, pruned_loss=0.05647, over 24671.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2803, pruned_loss=0.05647, over 24671.00 frames. ], batch size: 73, lr: 9.67e-03, grad_scale: 32.0 2023-09-29 11:44:22,142 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 11:44:36,232 INFO [train.py:1071] (0/4) Epoch 11, validation: loss=0.3103, simple_loss=0.2886, pruned_loss=0.166, over 1125622.00 frames. 
2023-09-29 11:44:36,233 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 11:44:38,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 11:44:38,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:44:42,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:44:48,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:48,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:44:48,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:48,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 11:44:50,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 11:44:53,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:54,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:57,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:44:57,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:59,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:44:59,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:00,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 11:45:02,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:08,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=354273.3333333333, ans=0.125 2023-09-29 11:45:11,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:45:11,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:13,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 11:45:18,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:45:18,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:45:20,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 11:45:22,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=354273.3333333333, ans=0.0 2023-09-29 11:45:24,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=354340.0, ans=0.0 2023-09-29 11:45:26,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:45:27,577 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.11 vs. limit=15.0 2023-09-29 11:45:33,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:36,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 11:45:39,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 11:45:41,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:45:41,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:42,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 11:45:42,868 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=354406.6666666667, ans=0.025 2023-09-29 11:45:44,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:45,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 11:45:49,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:50,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:54,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:45:57,370 INFO [train.py:1039] (0/4) Epoch 11, batch 50, loss[loss=0.1957, simple_loss=0.2655, pruned_loss=0.06298, over 23648.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2746, pruned_loss=0.06485, over 1078854.53 frames. ], batch size: 149, lr: 9.67e-03, grad_scale: 16.0 2023-09-29 11:45:57,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 11:46:01,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:46:02,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:05,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 11:46:05,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=354473.3333333333, ans=0.1 2023-09-29 11:46:07,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 11:46:07,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:46:07,481 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=354473.3333333333, ans=0.0 2023-09-29 11:46:08,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:46:09,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:11,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:14,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:17,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 11:46:17,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:24,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 11:46:25,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=354540.0, ans=0.125 2023-09-29 11:46:28,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 11:46:30,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 11:46:32,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:46:33,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:46:33,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:33,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:46:35,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:46:35,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:46:35,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:43,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:46:45,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 11:46:45,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:46:46,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 11:46:48,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:46:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:46:49,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 11:46:50,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:51,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 11:47:01,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:01,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:47:01,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=354740.0, ans=0.2 2023-09-29 11:47:02,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:47:02,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=354740.0, ans=0.0 2023-09-29 11:47:04,546 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.921e+02 2.105e+02 2.466e+02 3.711e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 11:47:04,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:04,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:07,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 11:47:07,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 11:47:09,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:09,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:11,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:47:12,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:47:12,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 11:47:14,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-09-29 11:47:14,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:47:15,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:17,321 INFO [train.py:1039] (0/4) Epoch 11, batch 100, loss[loss=0.2116, simple_loss=0.2731, pruned_loss=0.07502, over 23674.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2768, pruned_loss=0.06724, over 1884027.28 frames. ], batch size: 232, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:47:17,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:47:18,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 11:47:18,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 11:47:19,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:20,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:22,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:47:22,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:47:26,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:47:28,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:47:28,990 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.08 vs. limit=15.0 2023-09-29 11:47:31,870 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.57 vs. limit=22.5 2023-09-29 11:47:32,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:34,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 11:47:34,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:36,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:47:36,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:39,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:39,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:39,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 11:47:40,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 11:47:42,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:47:42,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:42,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:42,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:47,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 11:47:47,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:49,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:49,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:47:51,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:47:54,466 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 11:47:54,490 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. 
Duration: 0.896 2023-09-29 11:47:56,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:47:56,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:48:00,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:48:03,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:48:05,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:09,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:11,183 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 11:48:12,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:48:17,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:17,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:48:19,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 11:48:21,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=355006.6666666667, ans=0.0 2023-09-29 11:48:22,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:25,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:27,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:48:30,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:31,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:33,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:33,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:48:33,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:34,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 11:48:34,655 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 11:48:34,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 11:48:36,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:48:37,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:37,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:37,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:48:37,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:48:37,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:48:37,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:39,662 INFO [train.py:1039] (0/4) Epoch 11, batch 150, loss[loss=0.1989, simple_loss=0.2643, pruned_loss=0.06677, over 23619.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2793, pruned_loss=0.07054, over 2490848.21 frames. ], batch size: 232, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:48:39,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:41,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:43,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 11:48:43,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:48:45,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:45,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=355140.0, ans=0.1 2023-09-29 11:48:47,748 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.96 vs. limit=22.5 2023-09-29 11:48:48,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:48,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:48:48,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:52,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:53,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:56,966 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:48:58,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:59,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 11:49:02,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 11:49:02,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 11:49:02,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 11:49:07,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:49:07,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:49:07,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:49:07,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=355206.6666666667, ans=0.1 2023-09-29 11:49:08,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:49:08,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:10,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:10,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:11,833 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-29 11:49:13,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:20,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:24,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.31 vs. limit=15.0 2023-09-29 11:49:25,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:49:25,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=355273.3333333333, ans=0.125 2023-09-29 11:49:26,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 11:49:29,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:49:29,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:29,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:30,601 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.41 vs. limit=6.0 2023-09-29 11:49:33,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 11:49:35,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:49:36,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:49:36,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:36,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 11:49:41,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:42,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:49:42,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:49:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:49:43,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=355406.6666666667, ans=0.0 2023-09-29 11:49:44,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:46,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-09-29 11:49:47,845 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.912e+02 2.159e+02 2.654e+02 4.388e+02, threshold=4.317e+02, percent-clipped=1.0 2023-09-29 11:49:48,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:49:50,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:49:54,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:49:55,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:49:55,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 11:49:57,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:57,082 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 11:50:01,511 INFO [train.py:1039] (0/4) Epoch 11, batch 200, loss[loss=0.2094, simple_loss=0.2884, pruned_loss=0.06526, over 24669.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2803, pruned_loss=0.07131, over 2978872.82 frames. ], batch size: 73, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:50:01,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:05,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 11:50:05,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:50:08,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 11:50:09,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:09,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:13,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 11:50:14,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:50:16,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:17,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:20,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:50:22,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:22,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:43,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 11:50:43,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:50:44,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:50:44,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:50:46,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:50:46,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:50:47,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:49,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:50:50,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:50,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:50:51,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=355673.3333333333, ans=0.0 2023-09-29 11:50:52,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. 
Duration: 0.92 2023-09-29 11:50:52,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=355673.3333333333, ans=0.0 2023-09-29 11:50:53,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:50:53,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:54,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=355673.3333333333, ans=0.0 2023-09-29 11:51:00,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:51:05,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:51:12,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:14,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:51:16,462 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.32 vs. limit=15.0 2023-09-29 11:51:22,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:23,803 INFO [train.py:1039] (0/4) Epoch 11, batch 250, loss[loss=0.1971, simple_loss=0.2566, pruned_loss=0.06883, over 23698.00 frames. ], tot_loss[loss=0.211, simple_loss=0.2794, pruned_loss=0.07127, over 3348881.80 frames. 
], batch size: 232, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:51:24,777 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-09-29 11:51:25,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 11:51:25,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:25,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:51:25,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:26,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:51:28,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 11:51:29,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:51:29,962 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 11:51:30,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=355806.6666666667, ans=0.125 2023-09-29 11:51:31,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:33,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 11:51:33,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:35,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:37,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:51:37,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:40,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:51:44,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:51:53,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:51:57,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:57,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:52:00,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=355940.0, ans=0.125 2023-09-29 11:52:03,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 11:52:05,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:52:07,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:52:07,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:07,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:52:07,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:52:07,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:10,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:52:12,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 11:52:12,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:52:12,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=356006.6666666667, ans=0.125 2023-09-29 11:52:13,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.67 vs. 
limit=15.0 2023-09-29 11:52:15,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:52:15,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:52:15,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:52:17,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:19,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:52:19,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:52:19,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=356006.6666666667, ans=0.035 2023-09-29 11:52:19,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=356006.6666666667, ans=0.2 2023-09-29 11:52:20,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:23,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:52:23,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:27,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 11:52:30,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:32,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:52:33,628 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.944e+02 2.181e+02 2.498e+02 3.489e+02, threshold=4.363e+02, percent-clipped=0.0 2023-09-29 11:52:38,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:41,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:52:45,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 11:52:46,481 INFO [train.py:1039] (0/4) Epoch 11, batch 300, loss[loss=0.2173, simple_loss=0.268, pruned_loss=0.08324, over 23791.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2766, pruned_loss=0.06981, over 3643839.59 frames. ], batch size: 212, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:52:46,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:52:46,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:48,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 11:52:48,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 11:52:49,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:52:49,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 11:52:53,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:55,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:00,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:53:00,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 11:53:01,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:53:03,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:53:03,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 11:53:03,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:08,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:53:11,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 11:53:11,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 11:53:15,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 11:53:15,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:18,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:20,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:20,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 11:53:20,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:53:23,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:53:25,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:53:26,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:53:33,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:53:33,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. 
Duration: 0.76 2023-09-29 11:53:33,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:53:36,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:37,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 11:53:38,123 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=356340.0, ans=0.1 2023-09-29 11:53:39,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:42,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:53:47,088 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.35 vs. limit=15.0 2023-09-29 11:53:47,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:53:47,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 11:53:51,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:51,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 11:53:54,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:56,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:53:56,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 11:53:57,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:53:59,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:00,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 11:54:02,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:54:03,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:05,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:05,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:06,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:09,944 INFO [train.py:1039] (0/4) Epoch 11, batch 350, loss[loss=0.2034, simple_loss=0.2689, pruned_loss=0.06898, over 23794.00 frames. 
], tot_loss[loss=0.2063, simple_loss=0.2749, pruned_loss=0.06886, over 3871303.56 frames. ], batch size: 212, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:54:11,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:11,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:54:14,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:19,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:21,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:22,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:27,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 11:54:29,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:29,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 11:54:33,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:33,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. 
Duration: 0.59 2023-09-29 11:54:35,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:37,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 11:54:37,362 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=356540.0, ans=0.0 2023-09-29 11:54:39,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:54:41,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:42,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:54:44,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:54:44,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:46,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:54:47,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 11:54:47,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:57,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:54:57,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:54:57,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:54:57,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:01,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=356673.3333333333, ans=0.2 2023-09-29 11:55:02,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 11:55:02,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:55:07,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=356673.3333333333, ans=0.0 2023-09-29 11:55:08,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:08,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:08,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 11:55:10,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 11:55:12,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:14,062 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 11:55:16,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 11:55:16,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:19,169 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.993e+02 2.217e+02 2.521e+02 3.405e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 11:55:19,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:55:19,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 11:55:22,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:25,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:55:25,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:26,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:55:26,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:30,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:32,090 INFO [train.py:1039] (0/4) Epoch 11, batch 400, loss[loss=0.1916, simple_loss=0.273, pruned_loss=0.05511, over 24655.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.2737, pruned_loss=0.06792, over 4058839.34 frames. ], batch size: 68, lr: 9.64e-03, grad_scale: 32.0 2023-09-29 11:55:33,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:55:37,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:55:38,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 11:55:38,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:38,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:41,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:55:42,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:45,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:55:45,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=356806.6666666667, ans=0.0 2023-09-29 11:55:47,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:49,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 11:55:51,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 11:55:51,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:52,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 11:55:53,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:54,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=356873.3333333333, ans=0.125 2023-09-29 11:55:57,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:55:57,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:57,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 11:55:57,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:55:57,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=356873.3333333333, ans=0.2 2023-09-29 11:55:57,930 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.25 vs. limit=15.0 2023-09-29 11:55:58,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:58,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:00,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:56:01,768 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 11:56:01,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 11:56:07,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:56:10,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:10,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 11:56:11,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 11:56:13,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 11:56:15,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:20,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=357006.6666666667, ans=0.0 2023-09-29 11:56:24,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 11:56:27,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:56:28,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 11:56:30,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:31,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:56:31,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 11:56:36,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:56:40,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=357073.3333333333, ans=0.125 2023-09-29 11:56:41,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:56:42,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 11:56:45,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:46,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 11:56:48,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:56:49,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 11:56:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:56:52,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:56:53,452 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.05 vs. limit=15.0 2023-09-29 11:56:54,757 INFO [train.py:1039] (0/4) Epoch 11, batch 450, loss[loss=0.1847, simple_loss=0.2465, pruned_loss=0.06148, over 23251.00 frames. ], tot_loss[loss=0.2038, simple_loss=0.273, pruned_loss=0.06725, over 4204188.15 frames. ], batch size: 119, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:56:54,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 11:56:56,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:56:57,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 11:56:59,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:56:59,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 11:57:01,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:57:01,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:57:01,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:01,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 11:57:03,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:57:03,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:57:06,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:57:16,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:16,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:19,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. 
Duration: 0.65 2023-09-29 11:57:21,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 11:57:24,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:57:26,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:28,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:32,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:34,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:36,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 11:57:37,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 11:57:41,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 11:57:41,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:57:41,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:42,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 11:57:44,206 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 11:57:44,221 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 11:57:44,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:44,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:57:46,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:57:48,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:57:49,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:51,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 11:57:52,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 11:57:54,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:57,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:57:57,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 11:58:00,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 11:58:04,477 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.885e+02 2.166e+02 2.451e+02 4.204e+02, threshold=4.332e+02, percent-clipped=0.0 2023-09-29 11:58:04,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:58:06,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 11:58:07,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 11:58:09,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:58:16,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:58:17,677 INFO [train.py:1039] (0/4) Epoch 11, batch 500, loss[loss=0.2179, simple_loss=0.2795, pruned_loss=0.07816, over 22913.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2734, pruned_loss=0.06686, over 4326516.55 frames. ], batch size: 322, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:58:17,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:19,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:58:19,461 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-09-29 11:58:24,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:24,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:58:26,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:26,155 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 11:58:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 11:58:27,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:29,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=357473.3333333333, ans=0.0 2023-09-29 11:58:30,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:58:31,858 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.40 vs. limit=22.5 2023-09-29 11:58:33,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:58:33,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.63 vs. limit=22.5 2023-09-29 11:58:35,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 11:58:37,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:37,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:39,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:58:49,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:50,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 11:58:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:58:50,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:52,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 11:58:52,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:58:55,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:58:55,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:58:57,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 11:58:57,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:58,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 11:58:59,735 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:59:02,582 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 11:59:06,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:07,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:08,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=357673.3333333333, ans=0.125 2023-09-29 11:59:09,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:59:10,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 11:59:15,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 11:59:17,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:20,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:24,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:31,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:35,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 11:59:35,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:35,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:38,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 11:59:39,828 INFO [train.py:1039] (0/4) Epoch 11, batch 550, loss[loss=0.2123, simple_loss=0.2893, pruned_loss=0.06768, over 24652.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2752, pruned_loss=0.06802, over 4416032.32 frames. ], batch size: 68, lr: 9.62e-03, grad_scale: 32.0 2023-09-29 11:59:39,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:59:42,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 11:59:42,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=357806.6666666667, ans=0.0 2023-09-29 11:59:46,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 11:59:48,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 11:59:48,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:50,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 11:59:50,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:59:50,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:51,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:59:53,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 11:59:55,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=357873.3333333333, ans=0.125 2023-09-29 11:59:56,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:56,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 11:59:58,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:00:00,435 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:00:03,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:04,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:05,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:06,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:10,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 12:00:10,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 12:00:13,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 12:00:13,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=357940.0, ans=0.125 2023-09-29 12:00:16,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:00:16,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:17,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=357940.0, ans=0.95 2023-09-29 12:00:18,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=357940.0, ans=0.125 2023-09-29 12:00:19,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:00:20,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=357940.0, ans=0.125 2023-09-29 12:00:23,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:23,373 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 12:00:23,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:25,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-29 12:00:25,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=357940.0, ans=0.125 2023-09-29 12:00:26,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=357940.0, ans=0.125 2023-09-29 12:00:28,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:28,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:00:28,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:00:29,161 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.56 vs. limit=15.0 2023-09-29 12:00:30,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:31,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 12:00:33,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 12:00:34,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:34,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 12:00:34,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:00:34,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:00:37,099 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.95 vs. limit=22.5 2023-09-29 12:00:38,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:00:40,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:00:41,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=358006.6666666667, ans=0.0 2023-09-29 12:00:43,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:00:43,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:43,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 12:00:46,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:00:48,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:00:49,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:00:49,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:51,043 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.069e+02 2.330e+02 2.802e+02 5.186e+02, threshold=4.661e+02, percent-clipped=1.0 2023-09-29 12:00:53,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:00:53,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:00:59,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 12:01:02,382 INFO [train.py:1039] (0/4) Epoch 11, batch 600, loss[loss=0.214, simple_loss=0.2653, pruned_loss=0.08139, over 23707.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2747, pruned_loss=0.06797, over 4482618.73 frames. ], batch size: 232, lr: 9.62e-03, grad_scale: 16.0 2023-09-29 12:01:02,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 12:01:04,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:01:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:01:06,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:01:12,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=358140.0, ans=0.125 2023-09-29 12:01:14,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:01:14,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:01:14,982 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.09 vs. limit=15.0 2023-09-29 12:01:17,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 12:01:20,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:01:20,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:22,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:25,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 12:01:25,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:01:31,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-09-29 12:01:34,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:01:34,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:34,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:01:39,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:01:39,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:01:41,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:01:53,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:53,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:53,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:02:02,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. 
Duration: 0.97 2023-09-29 12:02:02,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=358340.0, ans=0.0 2023-09-29 12:02:07,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:02:07,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:07,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=358406.6666666667, ans=0.1 2023-09-29 12:02:09,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=358406.6666666667, ans=0.125 2023-09-29 12:02:13,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 12:02:13,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:02:17,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 12:02:17,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:02:17,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:02:24,979 INFO [train.py:1039] (0/4) Epoch 11, batch 650, loss[loss=0.1779, simple_loss=0.2488, pruned_loss=0.05349, over 24344.00 frames. ], tot_loss[loss=0.2038, simple_loss=0.2731, pruned_loss=0.06723, over 4535887.31 frames. 
], batch size: 56, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:02:25,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:02:26,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:02:29,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:02:31,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:02:34,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:02:34,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=358473.3333333333, ans=0.04949747468305833 2023-09-29 12:02:35,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 12:02:37,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:02:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:02:43,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:47,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 12:02:51,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 12:02:53,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:02:55,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:58,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:58,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:03:01,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:01,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:02,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:03:03,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:05,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:03:09,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:03:09,450 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-09-29 12:03:09,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:09,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:09,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=358606.6666666667, ans=0.1 2023-09-29 12:03:13,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:14,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:14,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:03:16,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 12:03:18,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:03:18,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:03:18,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:03:18,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:03:21,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:03:23,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 12:03:24,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 12:03:24,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:24,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:24,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:03:26,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:03:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:03:33,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:33,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:03:35,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 12:03:37,066 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.924e+02 2.251e+02 2.757e+02 4.294e+02, threshold=4.503e+02, percent-clipped=0.0 2023-09-29 12:03:37,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:37,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:03:37,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:46,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:03:46,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:46,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:03:47,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:48,382 INFO [train.py:1039] (0/4) Epoch 11, batch 700, loss[loss=0.2038, simple_loss=0.2683, pruned_loss=0.0696, over 23900.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2723, pruned_loss=0.06593, over 4584618.87 frames. ], batch size: 195, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:03:48,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=358806.6666666667, ans=0.125 2023-09-29 12:03:52,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. 
Duration: 0.93 2023-09-29 12:03:52,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 12:03:55,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 12:03:56,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:58,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:04:00,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 12:04:03,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:09,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:04:10,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:12,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:04:12,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:04:15,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:17,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-29 12:04:17,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:04:20,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 12:04:23,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 12:04:27,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:04:28,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:04:30,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:04:34,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=358940.0, ans=0.035 2023-09-29 12:04:35,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:04:35,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 12:04:39,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=359006.6666666667, ans=0.2 2023-09-29 12:04:41,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:41,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 12:04:42,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 12:04:45,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:47,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:51,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:04:57,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:04:57,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 12:05:00,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 12:05:00,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 12:05:05,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:06,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:08,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:10,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:05:10,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 12:05:12,363 INFO [train.py:1039] (0/4) Epoch 11, batch 750, loss[loss=0.2177, simple_loss=0.294, pruned_loss=0.0707, over 24067.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2718, pruned_loss=0.0655, over 4613140.67 frames. ], batch size: 80, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:05:12,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=359140.0, ans=0.2 2023-09-29 12:05:15,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 12:05:15,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 12:05:15,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 12:05:17,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 12:05:17,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 12:05:17,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:05:18,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 12:05:20,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:05:20,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:23,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:25,196 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=359140.0, ans=0.2 2023-09-29 12:05:26,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:26,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:05:26,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:28,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:05:29,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:05:31,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:05:33,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:34,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:36,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-09-29 12:05:36,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:05:39,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:39,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:41,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:05:43,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 12:05:43,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:45,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 12:05:45,140 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 12:05:47,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 12:05:47,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:05:47,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:05:50,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 12:05:55,745 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.78 vs. limit=15.0 2023-09-29 12:05:57,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:57,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:05:57,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:06:00,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:06:02,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 12:06:05,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:06:06,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 12:06:07,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:06:07,255 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:06:12,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 12:06:14,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 12:06:14,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:18,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:20,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:06:22,363 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.014e+02 2.278e+02 2.730e+02 4.361e+02, threshold=4.557e+02, percent-clipped=0.0 2023-09-29 12:06:22,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:22,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=359406.6666666667, ans=0.0 2023-09-29 12:06:25,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:06:29,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 12:06:29,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:30,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:32,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 12:06:33,808 INFO [train.py:1039] (0/4) Epoch 11, batch 800, loss[loss=0.2077, simple_loss=0.2921, pruned_loss=0.06162, over 24626.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2723, pruned_loss=0.06536, over 4647459.23 frames. ], batch size: 73, lr: 9.60e-03, grad_scale: 32.0 2023-09-29 12:06:33,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:35,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:35,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:06:45,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:45,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:47,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:47,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:50,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:50,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:52,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:06:55,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:57,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:00,019 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=359540.0, ans=0.2 2023-09-29 12:07:01,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 12:07:02,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:02,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:07:02,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:04,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:04,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 12:07:04,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:04,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 12:07:08,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:07:11,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:12,517 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.88 vs. limit=15.0 2023-09-29 12:07:13,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:07:13,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:16,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:16,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:20,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:07:20,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:07:22,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 12:07:23,714 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 12:07:23,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 12:07:23,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 12:07:23,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:07:27,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:27,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:07:27,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=359673.3333333333, ans=0.125 2023-09-29 12:07:27,540 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:07:32,511 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 12:07:33,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 12:07:35,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:07:37,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:07:40,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=359740.0, ans=0.125 2023-09-29 12:07:41,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:07:42,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.56 vs. 
limit=10.0 2023-09-29 12:07:44,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:46,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 12:07:47,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:50,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 12:07:55,883 INFO [train.py:1039] (0/4) Epoch 11, batch 850, loss[loss=0.208, simple_loss=0.2905, pruned_loss=0.06273, over 23966.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2734, pruned_loss=0.06565, over 4672626.72 frames. ], batch size: 80, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:07:56,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:57,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:07:59,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 12:08:00,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:08:01,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:02,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. 
Duration: 0.94 2023-09-29 12:08:02,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:05,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:08:06,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:08,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:08:10,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:08:10,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 12:08:11,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 12:08:11,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 12:08:13,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:08:13,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:08:15,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:15,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:08:16,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:08:21,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:21,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:21,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 12:08:24,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 12:08:27,364 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.19 vs. limit=15.0 2023-09-29 12:08:29,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:31,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 12:08:34,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 12:08:36,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 12:08:38,074 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=359940.0, ans=0.125 2023-09-29 12:08:40,051 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-09-29 12:08:40,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:40,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:08:40,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:08:44,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:45,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:46,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 12:08:48,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:50,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:50,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:08:50,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:08:51,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:08:53,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 12:08:54,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 12:08:58,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:08:58,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:08:59,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:08:59,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:08:59,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=360006.6666666667, ans=0.0 2023-09-29 12:09:01,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:01,555 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=360073.3333333333, ans=0.2 2023-09-29 12:09:03,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:09:04,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:09:06,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:09:07,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:09:07,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:09:08,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=360073.3333333333, ans=0.0 2023-09-29 12:09:09,284 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.084e+02 2.353e+02 2.728e+02 3.950e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 12:09:16,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:09:18,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:09:19,685 INFO [train.py:1039] (0/4) Epoch 11, batch 900, loss[loss=0.212, simple_loss=0.2741, pruned_loss=0.07493, over 23755.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.275, pruned_loss=0.06685, over 4683504.30 frames. ], batch size: 179, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:09:19,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 12:09:19,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:19,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:22,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. 
Duration: 0.77 2023-09-29 12:09:23,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=360140.0, ans=0.0 2023-09-29 12:09:27,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:09:30,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:30,839 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=360140.0, ans=0.0 2023-09-29 12:09:32,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 12:09:35,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:09:37,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 12:09:38,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 12:09:38,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:38,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:09:40,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:09:40,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 12:09:40,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=360206.6666666667, ans=10.0 2023-09-29 12:09:50,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:52,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:52,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:09:56,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:02,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 12:10:04,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:10:07,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:10:07,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:10:09,098 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 12:10:11,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 12:10:15,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 12:10:15,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:10:17,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:10:24,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:24,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:10:27,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 12:10:27,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:28,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 12:10:30,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:10:31,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:31,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:10:31,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:10:37,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-09-29 12:10:37,912 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 12:10:39,419 INFO [train.py:1039] (0/4) Epoch 11, batch 950, loss[loss=0.2082, simple_loss=0.2707, pruned_loss=0.07279, over 20570.00 frames. ], tot_loss[loss=0.2039, simple_loss=0.2746, pruned_loss=0.06663, over 4691454.69 frames. ], batch size: 45, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:10:40,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:10:40,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 12:10:43,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:43,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=360473.3333333333, ans=0.125 2023-09-29 12:10:46,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 12:10:51,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:10:52,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:52,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=360473.3333333333, ans=0.1 2023-09-29 12:10:54,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:10:54,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:10:55,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=360540.0, ans=0.1 2023-09-29 12:10:58,495 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 12:11:01,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:01,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:03,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:03,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:11:03,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 12:11:04,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:11:06,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:06,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=360540.0, ans=0.125 2023-09-29 12:11:08,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-09-29 12:11:08,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:10,367 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.06 vs. limit=22.5 2023-09-29 12:11:12,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:12,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:12,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:11:14,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 12:11:14,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=360606.6666666667, ans=0.0 2023-09-29 12:11:16,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:11:17,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:19,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:11:24,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:11:24,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:31,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 12:11:31,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:11:31,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:11:31,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:31,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:31,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:11:37,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 12:11:37,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:11:42,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:42,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:42,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. 
Duration: 0.8 2023-09-29 12:11:43,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:11:43,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 12:11:48,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:11:50,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:50,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=360740.0, ans=0.1 2023-09-29 12:11:50,991 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.54 vs. limit=22.5 2023-09-29 12:11:51,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.923e+02 2.189e+02 2.546e+02 4.043e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 12:11:52,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=360740.0, ans=0.0 2023-09-29 12:11:53,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:11:53,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. 
Duration: 0.91 2023-09-29 12:11:55,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 12:12:00,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:12:02,420 INFO [train.py:1039] (0/4) Epoch 11, batch 1000, loss[loss=0.2236, simple_loss=0.2807, pruned_loss=0.08323, over 23752.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2743, pruned_loss=0.06752, over 4678706.09 frames. ], batch size: 179, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:12:05,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 12:12:05,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:10,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:12:11,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 12:12:11,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 12:12:12,469 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.43 vs. limit=15.0 2023-09-29 12:12:16,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:16,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:12:17,266 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=15.0 2023-09-29 12:12:19,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:21,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 12:12:24,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 12:12:26,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 12:12:26,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:29,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 12:12:31,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 12:12:31,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 12:12:33,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:35,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:44,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:12:44,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:12:45,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:47,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:47,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 12:12:47,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:48,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:12:49,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:49,169 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 12:12:49,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=361006.6666666667, ans=0.1 2023-09-29 12:12:55,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 12:12:55,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 12:12:56,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. 
Duration: 0.82 2023-09-29 12:12:59,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:13:09,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:09,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:13:10,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:10,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:13:12,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 12:13:13,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:13:13,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 12:13:15,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 12:13:16,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:16,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:13:18,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 12:13:20,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:13:21,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:13:23,133 INFO [train.py:1039] (0/4) Epoch 11, batch 1050, loss[loss=0.2115, simple_loss=0.2689, pruned_loss=0.07701, over 23816.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2724, pruned_loss=0.06707, over 4678938.87 frames. ], batch size: 195, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:13:24,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:13:26,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:13:27,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:13:29,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:29,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=361140.0, ans=0.0 2023-09-29 12:13:32,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:13:35,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:13:36,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 12:13:40,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:13:42,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:13:42,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:13:43,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:13:44,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 12:13:45,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:13:46,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 12:13:49,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:49,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 12:13:49,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:13:54,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=361273.3333333333, ans=0.1 2023-09-29 12:13:55,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:13:57,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:13:57,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:14:00,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 12:14:00,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 12:14:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:14:03,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 12:14:06,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 12:14:08,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:12,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:14:12,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:14:14,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:14:16,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 12:14:21,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:14:24,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 12:14:26,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 12:14:26,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 12:14:26,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:27,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:14:27,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 12:14:27,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=361406.6666666667, ans=0.125 2023-09-29 12:14:32,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:14:33,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:33,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:14:34,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.50 vs. 
limit=15.0 2023-09-29 12:14:35,263 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.858e+02 2.245e+02 2.592e+02 4.386e+02, threshold=4.489e+02, percent-clipped=1.0 2023-09-29 12:14:35,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:35,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:37,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=361406.6666666667, ans=0.0 2023-09-29 12:14:39,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:39,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 12:14:42,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:42,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 12:14:42,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 12:14:42,231 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=361406.6666666667, ans=0.2 2023-09-29 12:14:43,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:14:44,800 INFO [train.py:1039] (0/4) Epoch 11, batch 1100, loss[loss=0.1738, simple_loss=0.2453, pruned_loss=0.0511, over 24341.00 frames. 
], tot_loss[loss=0.202, simple_loss=0.2721, pruned_loss=0.06596, over 4704066.08 frames. ], batch size: 56, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:14:45,184 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=361473.3333333333, ans=0.125 2023-09-29 12:14:45,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=361473.3333333333, ans=0.5 2023-09-29 12:14:47,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:14:52,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:14:53,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=361473.3333333333, ans=0.0 2023-09-29 12:14:57,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:14:59,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:14:59,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:00,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 12:15:00,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:02,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 12:15:02,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=361540.0, ans=0.125 2023-09-29 12:15:05,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:15:08,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:15:08,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 12:15:10,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:15:11,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:11,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:15:13,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:15:15,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:15:19,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:15:23,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. 
Duration: 0.98 2023-09-29 12:15:23,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=361606.6666666667, ans=0.0 2023-09-29 12:15:24,383 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 12:15:24,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:28,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:29,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:15:29,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:15:31,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 12:15:31,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:15:31,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:15:31,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:15:32,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:32,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. 
Duration: 0.77 2023-09-29 12:15:36,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=361673.3333333333, ans=0.1 2023-09-29 12:15:39,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:15:39,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 12:15:40,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:15:44,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:15:50,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 12:15:50,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:15:51,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:51,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=361740.0, ans=0.125 2023-09-29 12:15:54,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:54,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=361740.0, ans=0.0 2023-09-29 12:15:56,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:15:57,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 12:15:59,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:15:59,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:16:01,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 12:16:02,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:16:02,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 12:16:04,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:16:04,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:16:05,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:16:08,845 INFO [train.py:1039] (0/4) Epoch 11, batch 1150, loss[loss=0.1893, simple_loss=0.266, pruned_loss=0.05627, over 24666.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2724, pruned_loss=0.06588, over 4713452.27 frames. ], batch size: 65, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:16:09,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 12:16:10,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:16:11,599 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.05 vs. limit=15.0 2023-09-29 12:16:13,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:13,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:16:13,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 12:16:13,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=361806.6666666667, ans=0.125 2023-09-29 12:16:13,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=361806.6666666667, ans=0.09899494936611666 2023-09-29 12:16:15,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:18,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 12:16:18,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:18,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 12:16:25,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 12:16:27,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:30,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:32,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:33,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 12:16:33,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:16:33,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:35,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=361873.3333333333, ans=0.0 2023-09-29 12:16:40,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 12:16:40,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:41,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:16:43,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=361940.0, ans=0.5 2023-09-29 12:16:45,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=361940.0, ans=0.0 2023-09-29 12:16:51,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:53,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=361940.0, ans=0.1 2023-09-29 12:16:58,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:17:00,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 12:17:00,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:00,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:08,479 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 12:17:10,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:19,603 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. 
Duration: 0.3415625 2023-09-29 12:17:21,037 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.993e+02 2.217e+02 2.557e+02 3.633e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 12:17:24,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:25,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:17:25,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:17:25,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:17:30,748 INFO [train.py:1039] (0/4) Epoch 11, batch 1200, loss[loss=0.2159, simple_loss=0.2853, pruned_loss=0.07321, over 23341.00 frames. ], tot_loss[loss=0.2025, simple_loss=0.2727, pruned_loss=0.06611, over 4709035.89 frames. ], batch size: 105, lr: 9.57e-03, grad_scale: 32.0 2023-09-29 12:17:30,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:36,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:17:36,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:17:41,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:17:41,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:17:42,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:17:44,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:17:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:17:47,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:47,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:48,923 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 12:17:49,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=362206.6666666667, ans=0.125 2023-09-29 12:17:52,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 12:17:55,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:17:58,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:18:00,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:01,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 12:18:01,885 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 12:18:03,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:12,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:18:12,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:18:12,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 12:18:12,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:18:17,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 12:18:20,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 12:18:20,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:20,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:18:21,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=362340.0, ans=0.07 2023-09-29 12:18:22,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:18:22,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:18:25,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:25,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:18:26,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:18:27,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 12:18:28,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:18:28,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:29,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:18:29,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=362340.0, ans=0.0 2023-09-29 12:18:30,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:30,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:35,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. 
Duration: 0.52225 2023-09-29 12:18:36,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:18:40,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 12:18:46,455 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 12:18:48,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:51,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:52,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:18:52,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=362473.3333333333, ans=0.2 2023-09-29 12:18:53,990 INFO [train.py:1039] (0/4) Epoch 11, batch 1250, loss[loss=0.2243, simple_loss=0.2862, pruned_loss=0.08119, over 22703.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2739, pruned_loss=0.0664, over 4724684.92 frames. ], batch size: 322, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:18:54,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:57,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-09-29 12:18:58,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=362473.3333333333, ans=0.1 2023-09-29 12:19:00,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=362473.3333333333, ans=0.125 2023-09-29 12:19:01,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:19:03,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:05,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 12:19:06,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:19:08,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:19:11,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=362540.0, ans=0.0 2023-09-29 12:19:13,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:19:13,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:14,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 12:19:14,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:18,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:19:23,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:19:23,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:19:24,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:19:26,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:27,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:28,190 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:19:28,624 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.91 vs. limit=22.5 2023-09-29 12:19:30,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:32,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:19:37,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-09-29 12:19:37,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:19:39,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:19:39,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 12:19:41,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:41,288 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 12:19:41,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:41,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:47,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:48,113 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.56 vs. limit=15.0 2023-09-29 12:19:48,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:50,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:19:51,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-09-29 12:19:51,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 12:19:53,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 12:19:57,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:19:59,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 12:19:59,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:02,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:20:02,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:20:02,557 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=362740.0, ans=0.125 2023-09-29 12:20:05,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 12:20:05,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:20:05,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:20:06,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 12:20:06,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:08,085 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.898e+02 2.110e+02 2.286e+02 3.124e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-29 12:20:08,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 12:20:11,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:12,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:20:14,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:20:16,398 INFO [train.py:1039] (0/4) Epoch 11, batch 1300, loss[loss=0.1919, simple_loss=0.2744, pruned_loss=0.05474, over 24657.00 frames. ], tot_loss[loss=0.2042, simple_loss=0.2744, pruned_loss=0.06693, over 4721563.85 frames. ], batch size: 73, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:20:16,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:20:21,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:21,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-09-29 12:20:24,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=362806.6666666667, ans=0.0 2023-09-29 12:20:27,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:29,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:20:29,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:20:31,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:31,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:20:33,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 12:20:39,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:20:40,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:20:40,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 12:20:44,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 12:20:46,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=362873.3333333333, ans=0.125 2023-09-29 12:20:47,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:48,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:50,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:52,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:53,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:20:55,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:20:55,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 12:21:01,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:21:01,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:21:04,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 12:21:04,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 12:21:06,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:21:08,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:21:09,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 12:21:09,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:09,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 12:21:12,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:15,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:21:15,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:21:18,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 12:21:20,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 12:21:21,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 12:21:25,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 12:21:28,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 12:21:29,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:34,453 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.91 vs. limit=15.0 2023-09-29 12:21:39,103 INFO [train.py:1039] (0/4) Epoch 11, batch 1350, loss[loss=0.1967, simple_loss=0.25, pruned_loss=0.07166, over 23527.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2731, pruned_loss=0.06698, over 4719293.06 frames. ], batch size: 256, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:21:39,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 12:21:42,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:45,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:21:48,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:48,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:50,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:21:50,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 12:21:55,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:56,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 12:21:58,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:21:58,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:22:01,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 12:22:03,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:22:03,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:22:03,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 12:22:05,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=363206.6666666667, ans=0.0 2023-09-29 12:22:06,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 12:22:09,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-09-29 12:22:09,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=363206.6666666667, ans=0.2 2023-09-29 12:22:10,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:11,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 12:22:13,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=363273.3333333333, ans=0.0 2023-09-29 12:22:15,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=363273.3333333333, ans=0.2 2023-09-29 12:22:15,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=363273.3333333333, ans=0.125 2023-09-29 12:22:22,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:26,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=363340.0, ans=0.0 2023-09-29 12:22:32,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:32,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:32,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 12:22:36,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:22:37,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 12:22:38,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:22:39,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:22:41,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:22:41,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=363340.0, ans=0.125 2023-09-29 12:22:44,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 12:22:46,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:22:52,343 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.973e+02 2.286e+02 2.663e+02 4.619e+02, threshold=4.571e+02, percent-clipped=1.0 2023-09-29 12:22:52,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 12:22:54,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 12:22:59,930 INFO [train.py:1039] (0/4) Epoch 11, batch 1400, loss[loss=0.1838, simple_loss=0.2515, pruned_loss=0.05811, over 23649.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2717, pruned_loss=0.06606, over 4714059.37 frames. 
], batch size: 135, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:23:01,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 12:23:03,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:23:04,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:23:05,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:23:13,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 12:23:14,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 12:23:28,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:23:29,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:31,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:23:31,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:23:35,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:23:35,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 12:23:38,204 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.20 vs. limit=22.5 2023-09-29 12:23:47,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:49,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:52,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 12:23:55,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:23:56,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:23:56,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:23:56,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:58,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:23:58,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:23:58,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:24:01,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. 
Duration: 0.83 2023-09-29 12:24:01,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:24:06,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:09,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:24:14,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 12:24:15,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:24:17,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:24:21,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:24:21,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:22,650 INFO [train.py:1039] (0/4) Epoch 11, batch 1450, loss[loss=0.1829, simple_loss=0.2596, pruned_loss=0.05314, over 24452.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2713, pruned_loss=0.06568, over 4719237.50 frames. ], batch size: 63, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:24:24,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:24:27,309 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.87 vs. 
limit=15.0 2023-09-29 12:24:28,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:24:31,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:24:31,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:31,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:24:33,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=363806.6666666667, ans=0.025 2023-09-29 12:24:36,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:37,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:24:39,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:24:40,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 12:24:42,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:24:42,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 12:24:43,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:24:43,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:43,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 12:24:45,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:24:45,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:24:45,754 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=363873.3333333333, ans=0.0 2023-09-29 12:24:47,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 12:24:47,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:48,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:24:50,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:52,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=363873.3333333333, ans=0.0 2023-09-29 12:24:53,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:57,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 12:24:57,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:25:00,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.43 vs. limit=6.0 2023-09-29 12:25:01,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:25:01,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:03,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:25:03,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:25:03,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:04,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:09,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 12:25:11,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:25:15,920 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 12:25:16,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 12:25:16,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=364006.6666666667, ans=0.125 2023-09-29 12:25:17,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:25:19,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:20,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 12:25:23,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:25,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 12:25:27,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 12:25:27,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:33,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:25:33,896 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.44 vs. limit=15.0 2023-09-29 12:25:34,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:36,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. 
Duration: 0.9 2023-09-29 12:25:38,109 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.910e+02 2.158e+02 2.591e+02 3.926e+02, threshold=4.316e+02, percent-clipped=0.0 2023-09-29 12:25:38,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 12:25:39,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 12:25:41,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:42,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:25:45,888 INFO [train.py:1039] (0/4) Epoch 11, batch 1500, loss[loss=0.2025, simple_loss=0.2873, pruned_loss=0.05884, over 24295.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2716, pruned_loss=0.06579, over 4708703.74 frames. ], batch size: 74, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:25:53,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 12:25:54,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:25:54,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:25:55,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:56,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 12:25:56,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:25:58,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 12:26:01,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:26:01,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:26:01,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:26:03,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:26:04,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:06,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:06,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=364206.6666666667, ans=0.125 2023-09-29 12:26:08,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=364206.6666666667, ans=0.125 2023-09-29 12:26:12,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:26:12,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 12:26:12,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:26:14,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:26:14,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:17,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 12:26:17,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=364273.3333333333, ans=0.2 2023-09-29 12:26:22,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 12:26:23,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:26:24,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 12:26:26,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:26:28,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=364273.3333333333, ans=0.125 2023-09-29 12:26:29,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 12:26:30,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:30,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:26:31,187 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:26:33,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 12:26:34,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:26:34,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:35,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 12:26:36,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:42,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:26:42,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 12:26:50,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:26:51,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 12:26:54,816 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 12:26:54,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:26:54,922 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 12:26:55,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=364406.6666666667, ans=0.1 2023-09-29 12:26:56,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:57,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:26:59,354 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 12:27:00,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:27:02,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 12:27:05,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:06,867 INFO [train.py:1039] (0/4) Epoch 11, batch 1550, loss[loss=0.2073, simple_loss=0.2872, pruned_loss=0.06372, over 24542.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.2729, pruned_loss=0.06628, over 4704972.92 frames. 
], batch size: 71, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:27:07,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:27:10,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 12:27:12,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 12:27:12,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:27:12,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=364473.3333333333, ans=0.125 2023-09-29 12:27:13,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 12:27:13,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-09-29 12:27:17,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:18,011 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.78 vs. limit=15.0 2023-09-29 12:27:19,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:19,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:21,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:27:21,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:22,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:25,973 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 12:27:26,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:27,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:27:27,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:27:30,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 12:27:30,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 12:27:32,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:32,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 12:27:33,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 12:27:33,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 12:27:33,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:35,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:36,158 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.04 vs. limit=15.0 2023-09-29 12:27:39,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:27:42,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 12:27:42,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-09-29 12:27:43,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=364606.6666666667, ans=0.125 2023-09-29 12:27:46,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=364606.6666666667, ans=0.05 2023-09-29 12:27:51,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:57,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:57,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:27:57,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:27:57,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=364673.3333333333, ans=0.0 2023-09-29 12:27:58,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 12:28:01,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:28:03,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:06,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:28:08,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:28:08,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:08,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 12:28:09,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:11,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:28:12,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:12,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=364740.0, ans=0.125 2023-09-29 12:28:14,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:28:14,128 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 12:28:15,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 12:28:16,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=364740.0, ans=0.0 2023-09-29 12:28:22,362 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.938e+02 2.255e+02 2.720e+02 4.386e+02, threshold=4.510e+02, percent-clipped=1.0 2023-09-29 12:28:22,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 12:28:28,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:29,652 INFO [train.py:1039] (0/4) Epoch 11, batch 1600, loss[loss=0.2243, simple_loss=0.2869, pruned_loss=0.08083, over 23673.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2736, pruned_loss=0.06628, over 4714201.89 frames. ], batch size: 256, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:28:29,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:31,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 12:28:32,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:34,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:34,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:28:34,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 12:28:34,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:28:37,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:37,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 12:28:39,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 12:28:40,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 12:28:43,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:28:45,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 12:28:45,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:28:48,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:28:53,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:55,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 12:28:59,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:28:59,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 12:29:01,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:01,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=364940.0, ans=0.125 2023-09-29 12:29:02,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 12:29:08,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 12:29:15,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 12:29:15,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:29:15,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:29:18,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 12:29:23,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-29 12:29:26,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:29:26,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:28,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:30,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:29:32,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:29:34,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:29:34,713 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=15.0 2023-09-29 12:29:35,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:29:41,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:43,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 12:29:44,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=365073.3333333333, ans=0.125 2023-09-29 12:29:46,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 12:29:46,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:29:46,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 12:29:50,779 INFO [train.py:1039] (0/4) Epoch 11, batch 1650, loss[loss=0.1982, simple_loss=0.2804, pruned_loss=0.05796, over 24572.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2738, pruned_loss=0.06637, over 4709760.91 frames. ], batch size: 71, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:29:50,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:29:52,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:29:53,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:29:53,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 12:29:53,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 12:29:53,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. 
Duration: 0.97 2023-09-29 12:29:55,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 12:29:57,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=365140.0, ans=0.125 2023-09-29 12:29:58,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:58,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:00,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:00,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:30:02,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:04,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 12:30:04,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=365140.0, ans=0.125 2023-09-29 12:30:07,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:30:07,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:07,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 12:30:07,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:30:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 12:30:08,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 12:30:16,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:30:18,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:30:21,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=365273.3333333333, ans=0.0 2023-09-29 12:30:27,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 12:30:27,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:29,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 12:30:33,552 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.15 vs. limit=22.5 2023-09-29 12:30:34,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:36,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:30:36,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:30:36,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:30:39,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:30:39,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:43,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:44,335 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.05 vs. limit=15.0 2023-09-29 12:30:45,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:45,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:45,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:45,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:48,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 12:30:51,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:51,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 12:30:51,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=365340.0, ans=0.1 2023-09-29 12:30:51,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=365340.0, ans=0.5 2023-09-29 12:30:52,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:54,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 12:30:55,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 12:30:57,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 12:30:57,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:57,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:30:58,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:58,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 12:30:58,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 12:31:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:31:04,948 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.140e+02 2.424e+02 3.897e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 12:31:05,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:05,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:09,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 12:31:11,712 INFO [train.py:1039] (0/4) Epoch 11, batch 1700, loss[loss=0.2138, simple_loss=0.2679, pruned_loss=0.07986, over 23615.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2733, pruned_loss=0.06637, over 4704175.32 frames. ], batch size: 256, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:31:11,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:11,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:31:14,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 12:31:14,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 12:31:15,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:31:15,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:17,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:31:17,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:31:17,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 12:31:20,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:31:23,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=365473.3333333333, ans=0.125 2023-09-29 12:31:25,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=365473.3333333333, ans=0.0 2023-09-29 12:31:30,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:33,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:31:37,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:31:37,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 12:31:38,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:38,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:31:42,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 12:31:46,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:31:46,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:47,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=365606.6666666667, ans=0.0 2023-09-29 12:31:48,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:31:49,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:31:51,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 12:31:52,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 12:31:54,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:54,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. 
Duration: 0.68 2023-09-29 12:31:56,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:56,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=365606.6666666667, ans=15.0 2023-09-29 12:32:03,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:05,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:06,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:32:07,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=365673.3333333333, ans=0.0 2023-09-29 12:32:08,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:32:08,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 12:32:08,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:32:11,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:11,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. 
Duration: 0.91 2023-09-29 12:32:11,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:32:11,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:13,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:13,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:17,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:17,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:32:17,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:19,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:32:19,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:24,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:24,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 12:32:26,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 12:32:27,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:29,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 12:32:34,315 INFO [train.py:1039] (0/4) Epoch 11, batch 1750, loss[loss=0.1804, simple_loss=0.253, pruned_loss=0.05393, over 24613.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2725, pruned_loss=0.06599, over 4707931.87 frames. ], batch size: 60, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:32:34,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:36,934 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=8.57 vs. limit=22.5 2023-09-29 12:32:37,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:37,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:32:38,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 12:32:38,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:42,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:32:42,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:32:47,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 12:32:50,430 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.49 vs. limit=12.0 2023-09-29 12:32:50,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:54,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 12:32:54,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:54,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=365873.3333333333, ans=0.125 2023-09-29 12:32:56,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:32:59,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:33:00,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 12:33:02,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:33:02,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-09-29 12:33:05,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=365940.0, ans=0.0 2023-09-29 12:33:10,241 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=365940.0, ans=0.125 2023-09-29 12:33:11,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:33:14,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:14,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:19,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:19,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:21,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:33:25,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:27,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:29,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:33:29,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-09-29 12:33:32,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:34,958 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.66 vs. limit=6.0 2023-09-29 12:33:35,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 12:33:35,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:38,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:39,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:33:42,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=366073.3333333333, ans=0.1 2023-09-29 12:33:43,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:33:43,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:33:45,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:46,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 12:33:48,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=366073.3333333333, ans=0.0 2023-09-29 12:33:49,680 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.016e+02 2.265e+02 2.513e+02 4.125e+02, threshold=4.530e+02, percent-clipped=0.0 2023-09-29 12:33:51,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:54,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:33:55,920 INFO [train.py:1039] (0/4) Epoch 11, batch 1800, loss[loss=0.2076, simple_loss=0.2696, pruned_loss=0.07283, over 23568.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2718, pruned_loss=0.06592, over 4712794.63 frames. ], batch size: 256, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:33:56,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:33:56,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 12:33:56,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:58,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:33:58,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:33:58,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 12:33:59,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:33:59,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:34:04,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:34:04,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:34:05,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:34:09,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:34:10,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:34:14,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:34:17,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:18,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 12:34:23,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:34:23,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 12:34:23,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:26,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:31,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 12:34:33,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 12:34:34,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 12:34:34,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:36,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:36,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:34:37,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:34:44,464 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. 
Duration: 0.896 2023-09-29 12:34:45,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:34:47,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:50,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 12:34:50,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 12:34:52,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:34:53,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:34:55,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:35:00,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 12:35:04,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:06,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 12:35:07,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:35:07,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 12:35:07,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:35:09,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 12:35:12,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:35:12,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:16,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 12:35:16,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:19,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:19,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:35:19,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,030 INFO [train.py:1039] (0/4) Epoch 11, batch 1850, loss[loss=0.1802, simple_loss=0.25, pruned_loss=0.05519, over 24375.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2717, pruned_loss=0.06569, over 4710591.80 frames. ], batch size: 56, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:35:21,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:35:21,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:35:24,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:35:24,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:27,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:35:28,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:35:32,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=366473.3333333333, ans=0.2 2023-09-29 12:35:35,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:35:35,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 12:35:40,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 12:35:42,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.33 vs. limit=22.5 2023-09-29 12:35:43,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. 
Duration: 0.93 2023-09-29 12:35:44,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=366540.0, ans=0.1 2023-09-29 12:35:48,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:48,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 12:35:48,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:35:53,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=366606.6666666667, ans=0.0 2023-09-29 12:35:53,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=366606.6666666667, ans=0.2 2023-09-29 12:35:57,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:58,237 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.35 vs. limit=15.0 2023-09-29 12:35:59,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 12:36:03,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:03,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:05,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-09-29 12:36:05,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:05,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:36:07,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:36:10,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:36:12,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:36:12,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=366673.3333333333, ans=0.125 2023-09-29 12:36:17,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:36:17,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:17,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:36:17,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:36:19,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=366673.3333333333, ans=0.025 2023-09-29 12:36:20,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:22,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:36:24,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=366673.3333333333, ans=0.0 2023-09-29 12:36:26,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 12:36:27,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:29,246 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=366740.0, ans=0.125 2023-09-29 12:36:32,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:36:32,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:36:32,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 12:36:32,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. 
Duration: 0.57 2023-09-29 12:36:32,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=366740.0, ans=0.125 2023-09-29 12:36:33,754 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 12:36:35,310 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 12:36:36,667 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.114e+02 2.440e+02 2.839e+02 4.239e+02, threshold=4.880e+02, percent-clipped=0.0 2023-09-29 12:36:36,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:36:36,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:36,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:36:38,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:38,429 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 12:36:38,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:36:39,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:39,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 12:36:41,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:36:42,873 INFO [train.py:1039] (0/4) Epoch 11, batch 1900, loss[loss=0.2171, simple_loss=0.2918, pruned_loss=0.07123, over 24440.00 frames. ], tot_loss[loss=0.2025, simple_loss=0.2725, pruned_loss=0.06627, over 4709103.42 frames. ], batch size: 77, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:36:43,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:43,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 12:36:46,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:46,193 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 12:36:46,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:36:47,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:55,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:56,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:36:58,760 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-09-29 12:37:00,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 12:37:00,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:37:01,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:37:01,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 12:37:01,884 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 12:37:07,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 12:37:09,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:37:13,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 12:37:15,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 12:37:23,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 12:37:25,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 12:37:25,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:37:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. 
Duration: 0.896 2023-09-29 12:37:26,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 12:37:26,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 12:37:28,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 12:37:28,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:37:33,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 12:37:35,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:37:40,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:37:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 12:37:40,877 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.74 vs. limit=15.0 2023-09-29 12:37:43,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:37:47,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 12:37:47,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 12:37:56,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:37:56,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:37:56,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:37:57,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:37:59,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:37:59,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:38:01,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:38:04,371 INFO [train.py:1039] (0/4) Epoch 11, batch 1950, loss[loss=0.2229, simple_loss=0.2997, pruned_loss=0.07301, over 24073.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2727, pruned_loss=0.06604, over 4716704.02 frames. ], batch size: 80, lr: 9.50e-03, grad_scale: 16.0 2023-09-29 12:38:04,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:04,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:07,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 12:38:07,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:07,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:38:09,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:13,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:14,255 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=367140.0, ans=0.09899494936611666 2023-09-29 12:38:14,909 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.73 vs. limit=15.0 2023-09-29 12:38:16,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:38:16,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:16,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:38:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 12:38:19,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-29 12:38:20,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:21,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:24,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:38:24,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:25,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:27,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:38:31,434 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.53 vs. limit=15.0 2023-09-29 12:38:32,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:32,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:38:32,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:38:32,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:35,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 12:38:39,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:39,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:39,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:38:39,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 12:38:39,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:38:40,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:38:41,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:44,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:46,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:48,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=367273.3333333333, ans=0.1 2023-09-29 12:38:53,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 12:38:54,074 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.36 vs. limit=22.5 2023-09-29 12:38:54,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:38:56,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:38:56,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 12:38:56,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:00,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:39:01,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:39:02,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:09,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:11,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:14,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:39:17,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:19,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:39:19,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:21,096 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=12.0 2023-09-29 12:39:21,450 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.994e+02 2.316e+02 2.639e+02 3.669e+02, threshold=4.632e+02, percent-clipped=0.0 2023-09-29 12:39:21,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 12:39:21,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:39:21,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:39:23,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 12:39:24,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:27,713 INFO [train.py:1039] (0/4) Epoch 11, batch 2000, loss[loss=0.1819, simple_loss=0.2554, pruned_loss=0.05426, over 24323.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2741, pruned_loss=0.06652, over 4711349.03 frames. 
], batch size: 61, lr: 9.50e-03, grad_scale: 32.0 2023-09-29 12:39:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:30,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:39:31,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=367473.3333333333, ans=0.07 2023-09-29 12:39:32,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:39:35,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:37,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 12:39:39,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:39:42,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:39:44,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 12:39:46,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 12:39:46,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:50,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:39:51,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 12:39:54,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:58,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 12:39:59,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:40:02,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 12:40:02,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:03,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:03,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 12:40:03,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:05,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:06,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:06,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 12:40:11,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 12:40:11,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:11,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:15,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:18,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:40:18,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:18,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:40:19,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 12:40:21,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:22,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:22,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:24,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:28,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:29,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 12:40:35,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:40:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:40:41,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:40:44,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:40:44,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:40:47,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=367740.0, ans=0.0 2023-09-29 12:40:48,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:50,403 INFO [train.py:1039] (0/4) Epoch 11, batch 2050, loss[loss=0.2032, simple_loss=0.2662, pruned_loss=0.07006, over 23366.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.273, pruned_loss=0.0668, over 4689412.36 frames. ], batch size: 119, lr: 9.49e-03, grad_scale: 32.0 2023-09-29 12:40:50,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:54,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:55,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:41:01,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:41:03,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 12:41:03,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:41:03,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=367806.6666666667, ans=0.0 2023-09-29 12:41:05,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:07,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 12:41:07,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:41:08,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:10,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:41:18,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:18,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:21,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 12:41:24,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:26,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. 
Duration: 0.82 2023-09-29 12:41:26,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:30,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:32,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:34,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:41:34,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:36,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:41:36,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:41:36,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:41:40,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:41,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:41:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:41:44,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 12:41:47,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:41:55,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:56,408 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.32 vs. limit=12.0 2023-09-29 12:41:57,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 12:41:59,854 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:42:01,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:02,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:42:06,927 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.014e+02 2.317e+02 2.683e+02 4.007e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 12:42:07,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:42:08,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 12:42:12,448 INFO [train.py:1039] (0/4) Epoch 11, batch 2100, loss[loss=0.1754, simple_loss=0.2574, pruned_loss=0.04675, over 24501.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2717, pruned_loss=0.06599, over 4702877.67 frames. 
], batch size: 63, lr: 9.49e-03, grad_scale: 16.0 2023-09-29 12:42:14,033 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 12:42:14,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:14,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:15,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:15,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:15,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 12:42:15,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=368140.0, ans=0.125 2023-09-29 12:42:17,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 12:42:18,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:42:21,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:42:21,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:42:22,785 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=15.0 2023-09-29 12:42:23,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=368140.0, ans=0.02 2023-09-29 12:42:25,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:25,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:42:25,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 12:42:26,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:42:28,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 12:42:28,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 12:42:32,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:42:32,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:42:32,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 12:42:32,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 12:42:33,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=368206.6666666667, ans=0.125 2023-09-29 12:42:38,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 12:42:38,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:39,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:42:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:45,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:42:47,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 12:42:47,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:42:47,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:42:48,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 12:42:48,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:48,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. 
Duration: 0.84 2023-09-29 12:42:48,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=368273.3333333333, ans=0.015 2023-09-29 12:42:50,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 12:42:50,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 12:42:51,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:42:53,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:42:56,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:58,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:59,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:01,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:01,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 12:43:01,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:01,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:43:03,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:03,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 12:43:04,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 12:43:06,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 12:43:09,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:43:12,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:43:14,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 12:43:19,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:21,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:43:21,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=368406.6666666667, ans=0.125 2023-09-29 12:43:22,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:43:22,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 12:43:23,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:43:23,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:43:25,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:25,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:43:26,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:43:27,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:28,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 12:43:30,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 12:43:30,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:32,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:32,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:43:34,233 INFO [train.py:1039] (0/4) Epoch 11, batch 2150, loss[loss=0.2332, simple_loss=0.2971, pruned_loss=0.08467, over 23746.00 frames. 
], tot_loss[loss=0.201, simple_loss=0.2707, pruned_loss=0.06565, over 4707938.93 frames. ], batch size: 135, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:43:34,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:43:34,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:43:39,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=368473.3333333333, ans=0.125 2023-09-29 12:43:41,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:43:42,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:44,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:44,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:43:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:45,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:43:50,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:51,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:43:51,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:43:56,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:56,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 12:44:02,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:04,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:44:04,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=368540.0, ans=0.125 2023-09-29 12:44:05,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:05,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 12:44:05,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=368606.6666666667, ans=0.1 2023-09-29 12:44:05,975 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368606.6666666667, ans=0.1 2023-09-29 12:44:06,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=368606.6666666667, ans=0.05 2023-09-29 12:44:07,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:07,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:44:07,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:44:09,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 12:44:12,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:44:12,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:14,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:14,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 12:44:16,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:44:17,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:17,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:44:19,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:19,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 12:44:19,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:44:22,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:24,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:25,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:27,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:44:29,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:29,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 12:44:29,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 12:44:32,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 12:44:32,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:44:32,970 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 12:44:33,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:33,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:44:33,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=368673.3333333333, ans=0.0 2023-09-29 12:44:34,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 12:44:34,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:44:34,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 12:44:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 12:44:34,697 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 12:44:34,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. 
Duration: 0.61 2023-09-29 12:44:37,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:37,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=368673.3333333333, ans=0.125 2023-09-29 12:44:39,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:39,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:44:40,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:40,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368740.0, ans=0.1 2023-09-29 12:44:42,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:44:45,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:45,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 12:44:47,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=368740.0, ans=0.125 2023-09-29 12:44:52,248 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.938e+02 2.164e+02 2.545e+02 3.667e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-29 12:44:52,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=368740.0, ans=0.07 2023-09-29 12:44:54,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:44:55,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 12:44:55,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=368806.6666666667, ans=0.125 2023-09-29 12:44:56,985 INFO [train.py:1039] (0/4) Epoch 11, batch 2200, loss[loss=0.2117, simple_loss=0.2747, pruned_loss=0.07437, over 23451.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2712, pruned_loss=0.06585, over 4718617.98 frames. ], batch size: 134, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:44:57,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:44:57,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368806.6666666667, ans=0.1 2023-09-29 12:45:02,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:02,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 12:45:02,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:05,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:45:07,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:45:08,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:45:08,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 12:45:13,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 12:45:14,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=368873.3333333333, ans=0.0 2023-09-29 12:45:15,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:45:20,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 12:45:20,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=368873.3333333333, ans=0.0 2023-09-29 12:45:24,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:24,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 12:45:25,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=368873.3333333333, ans=0.125 2023-09-29 12:45:26,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:45:29,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:45:29,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 12:45:33,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:45:36,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:36,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:45:38,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=368940.0, ans=0.0 2023-09-29 12:45:38,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=368940.0, ans=0.125 2023-09-29 12:45:40,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:45:41,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:43,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 12:45:43,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=368940.0, ans=0.125 2023-09-29 12:45:43,950 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=368940.0, ans=0.0 2023-09-29 12:45:45,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:49,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 12:45:50,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:45:51,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 12:45:55,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:55,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:45:55,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:57,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:58,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:58,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 12:46:00,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:01,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:46:01,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:46:04,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:46:07,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:46:07,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:11,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:46:12,616 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 12:46:14,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:46:14,744 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 12:46:16,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:46:17,069 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. 
Duration: 0.896 2023-09-29 12:46:19,823 INFO [train.py:1039] (0/4) Epoch 11, batch 2250, loss[loss=0.1953, simple_loss=0.2788, pruned_loss=0.05594, over 24401.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2717, pruned_loss=0.06589, over 4717521.88 frames. ], batch size: 69, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:46:19,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:20,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:46:21,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:25,094 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 12:46:25,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:46:26,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:27,609 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.71 vs. limit=22.5 2023-09-29 12:46:33,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:46:33,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:46:36,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:46:38,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:38,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:41,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 12:46:42,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:46:42,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:46:44,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 12:46:46,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:46:46,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:52,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:52,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=369273.3333333333, ans=0.125 2023-09-29 12:46:53,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 12:46:53,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:46:55,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 12:46:56,649 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.63 vs. limit=15.0 2023-09-29 12:46:57,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:59,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:47:05,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:07,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:07,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:47:07,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:47:07,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=369273.3333333333, ans=0.0 2023-09-29 12:47:10,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:47:12,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:47:16,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:47:20,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:47:25,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:47:25,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:47:25,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:47:32,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:47:32,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=369406.6666666667, ans=0.125 2023-09-29 12:47:35,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:47:35,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 12:47:35,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:36,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 12:47:38,347 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 2.043e+02 2.273e+02 2.723e+02 4.405e+02, threshold=4.547e+02, percent-clipped=1.0 2023-09-29 12:47:38,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 12:47:42,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:47:43,471 INFO [train.py:1039] (0/4) Epoch 11, batch 2300, loss[loss=0.2152, simple_loss=0.279, pruned_loss=0.07571, over 23417.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2729, pruned_loss=0.06646, over 4721923.71 frames. ], batch size: 119, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:47:43,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:48,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:49,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:47:50,484 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.41 vs. limit=15.0 2023-09-29 12:47:51,440 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 12:47:52,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:00,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:48:00,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:48:01,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:03,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:03,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 12:48:03,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:48:03,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=369540.0, ans=0.0 2023-09-29 12:48:05,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:05,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:48:12,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:48:13,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:48:13,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=369540.0, ans=0.1 2023-09-29 12:48:16,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:48:21,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:48:21,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:24,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:48:26,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:48:29,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:29,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:48:29,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:48:31,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 12:48:38,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:48:38,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:38,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:38,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 12:48:38,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:39,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 12:48:39,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:48:41,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 12:48:41,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:48:41,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:41,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 12:48:49,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:48:52,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:48:55,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:55,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:48:57,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 12:49:00,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:49:00,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:02,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:49:04,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 12:49:05,505 INFO [train.py:1039] (0/4) Epoch 11, batch 2350, loss[loss=0.1985, simple_loss=0.2837, pruned_loss=0.05663, over 24452.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2736, pruned_loss=0.06611, over 4725170.69 frames. ], batch size: 69, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:49:11,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:49:12,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 12:49:17,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 12:49:22,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:49:25,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 12:49:27,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:27,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:28,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 12:49:32,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:49:36,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 12:49:38,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:42,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:49:42,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:45,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:49:46,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=369940.0, ans=0.125 2023-09-29 12:49:47,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 12:49:48,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 12:49:50,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:50,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:49:50,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:49:55,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:49:57,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 12:49:57,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:50:00,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:50:01,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:50:04,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 12:50:04,703 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.58 vs. limit=15.0 2023-09-29 12:50:05,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:50:07,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. 
Duration: 0.92 2023-09-29 12:50:07,788 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.12 vs. limit=12.0 2023-09-29 12:50:08,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:50:11,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 12:50:13,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=370073.3333333333, ans=0.0 2023-09-29 12:50:15,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 12:50:17,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:50:17,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:50:17,988 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 12:50:18,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 12:50:19,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 12:50:22,233 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.64 vs. limit=22.5 2023-09-29 12:50:22,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 12:50:23,826 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.084e+02 2.454e+02 3.278e+02 4.890e+02, threshold=4.908e+02, percent-clipped=1.0 2023-09-29 12:50:25,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:50:29,134 INFO [train.py:1039] (0/4) Epoch 11, batch 2400, loss[loss=0.2207, simple_loss=0.2787, pruned_loss=0.08132, over 23869.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.273, pruned_loss=0.06574, over 4708145.89 frames. ], batch size: 195, lr: 9.46e-03, grad_scale: 32.0 2023-09-29 12:50:30,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:50:32,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:50:33,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 12:50:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 12:50:40,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:50:40,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:50:43,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 12:50:43,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 12:50:45,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:47,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 12:50:54,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:56,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 12:51:00,161 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.23 vs. limit=6.0 2023-09-29 12:51:01,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=370273.3333333333, ans=0.125 2023-09-29 12:51:02,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:51:07,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 12:51:09,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=370273.3333333333, ans=0.125 2023-09-29 12:51:12,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:13,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:51:17,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:18,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 12:51:20,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:51:27,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:30,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:51:32,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:51:33,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:51:33,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:51:33,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:51:33,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:35,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:51:35,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 12:51:40,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:51:40,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:51:40,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 12:51:43,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 12:51:46,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:46,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:46,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 12:51:48,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 12:51:48,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 12:51:48,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 12:51:49,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. 
Duration: 0.57 2023-09-29 12:51:50,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=370473.3333333333, ans=0.0 2023-09-29 12:51:51,311 INFO [train.py:1039] (0/4) Epoch 11, batch 2450, loss[loss=0.1921, simple_loss=0.2324, pruned_loss=0.07591, over 19120.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.271, pruned_loss=0.06501, over 4694584.60 frames. ], batch size: 389, lr: 9.46e-03, grad_scale: 16.0 2023-09-29 12:51:51,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:51,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:51,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:51:54,661 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 12:51:54,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:54,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:51:56,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=370473.3333333333, ans=0.0 2023-09-29 12:51:59,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:51:59,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 12:51:59,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=370473.3333333333, ans=0.125 2023-09-29 12:52:03,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:03,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:04,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 12:52:10,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:52:10,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:14,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:52:14,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:52:14,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:52:16,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. 
Duration: 0.77 2023-09-29 12:52:18,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=370540.0, ans=0.0 2023-09-29 12:52:19,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:21,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:52:22,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:52:25,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:52:25,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:52:30,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 12:52:30,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:52:38,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:40,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 12:52:40,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:41,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:52:41,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:45,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:52:46,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 12:52:48,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:50,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:52:53,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:52:53,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:59,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:52:59,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. 
Duration: 0.92 2023-09-29 12:53:00,170 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.49 vs. limit=10.0 2023-09-29 12:53:01,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:53:01,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:01,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 12:53:02,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:02,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:53:07,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:53:11,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:11,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:53:12,855 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.006e+02 2.269e+02 2.537e+02 3.932e+02, threshold=4.538e+02, percent-clipped=0.0 2023-09-29 12:53:14,480 INFO [train.py:1039] (0/4) Epoch 11, batch 2500, loss[loss=0.206, simple_loss=0.2819, pruned_loss=0.06506, over 24011.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.27, pruned_loss=0.06468, over 4702834.68 frames. 
], batch size: 86, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:53:16,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 12:53:16,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:53:16,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=370806.6666666667, ans=0.0 2023-09-29 12:53:22,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:28,305 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=370806.6666666667, ans=0.1 2023-09-29 12:53:32,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:53:32,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:53:35,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:35,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 12:53:45,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:53:45,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 12:53:47,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:53:47,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:53:48,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 12:53:50,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:50,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:53:50,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 12:53:50,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:52,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 12:53:52,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:56,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:56,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:54:00,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-29 12:54:00,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 12:54:00,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:04,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:06,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=371006.6666666667, ans=0.125 2023-09-29 12:54:07,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:07,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=371006.6666666667, ans=0.125 2023-09-29 12:54:09,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=371006.6666666667, ans=0.0 2023-09-29 12:54:11,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:15,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:20,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:54:23,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 12:54:23,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 12:54:25,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:54:26,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:54:26,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:54:28,443 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 12:54:28,444 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 12:54:28,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 12:54:31,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:33,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 12:54:33,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 12:54:34,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=371073.3333333333, ans=0.125 2023-09-29 12:54:35,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:36,965 INFO [train.py:1039] (0/4) Epoch 11, batch 2550, loss[loss=0.2161, simple_loss=0.2786, pruned_loss=0.07685, over 23724.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2702, pruned_loss=0.06441, over 4716266.86 frames. 
], batch size: 179, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:54:37,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 12:54:40,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 12:54:42,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:45,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:45,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:54:48,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:48,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 12:54:50,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:54:54,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 12:54:54,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=371206.6666666667, ans=0.04949747468305833 2023-09-29 12:54:55,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:54:57,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 12:55:00,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:55:00,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 12:55:00,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:00,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:02,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:03,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=371206.6666666667, ans=0.125 2023-09-29 12:55:05,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:55:05,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 12:55:05,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:55:06,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:06,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 12:55:19,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 12:55:24,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:24,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:24,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:26,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:55:33,578 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.01 vs. limit=15.0 2023-09-29 12:55:34,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:37,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:37,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:55:37,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:55:37,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:55:37,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 12:55:40,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:40,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:48,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:55:48,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 12:55:48,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:55:49,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:49,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:55:51,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:55:51,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:55:57,878 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.868e+02 2.141e+02 2.517e+02 4.100e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 12:55:59,462 INFO [train.py:1039] (0/4) Epoch 11, batch 2600, loss[loss=0.2786, simple_loss=0.3246, pruned_loss=0.1163, over 19260.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.2714, pruned_loss=0.06471, over 4712489.14 frames. 
], batch size: 388, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:55:59,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:01,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:04,870 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 12:56:06,534 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 12:56:06,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:56:08,046 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 12:56:08,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 12:56:08,204 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 12:56:11,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:56:11,270 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 12:56:12,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 12:56:17,022 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 12:56:18,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 12:56:20,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 12:56:21,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 12:56:22,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:56:23,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 12:56:25,964 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 12:56:25,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 12:56:34,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:34,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:34,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:34,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 12:56:36,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:56:42,637 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. 
Duration: 0.768 2023-09-29 12:56:47,462 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:56:48,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:48,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:50,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 12:56:52,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:56:52,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:52,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 12:56:52,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=371673.3333333333, ans=0.125 2023-09-29 12:56:55,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:56:57,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:58,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:56:59,571 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.92 vs. 
limit=15.0 2023-09-29 12:57:03,927 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 12:57:03,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:57:08,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:57:10,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:57:10,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 12:57:11,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:57:13,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:13,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:19,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 12:57:20,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:21,937 INFO [train.py:1039] (0/4) Epoch 11, batch 2650, loss[loss=0.188, simple_loss=0.2692, pruned_loss=0.05333, over 24654.00 frames. 
], tot_loss[loss=0.2011, simple_loss=0.2723, pruned_loss=0.06496, over 4716620.04 frames. ], batch size: 68, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:57:23,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:57:28,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 12:57:28,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:30,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:57:30,341 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 12:57:30,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:57:32,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:34,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:57:35,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:38,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:38,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. 
Duration: 0.92 2023-09-29 12:57:40,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:57:40,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:57:43,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 12:57:43,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=371873.3333333333, ans=0.125 2023-09-29 12:57:44,761 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 12:57:45,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=371873.3333333333, ans=0.125 2023-09-29 12:57:48,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:49,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 12:57:49,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:57:49,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 12:57:50,799 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.75 vs. limit=6.0 2023-09-29 12:57:55,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 12:57:55,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:57:55,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:56,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:00,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 12:58:00,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 12:58:03,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:07,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 12:58:07,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:09,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:10,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:10,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:10,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:58:13,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:13,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:15,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:58:15,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:58:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:58:20,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:20,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:58:21,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:23,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:23,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 12:58:26,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=372073.3333333333, ans=0.5 2023-09-29 12:58:26,738 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=372073.3333333333, ans=0.125 2023-09-29 12:58:27,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:27,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:58:27,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:28,222 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:58:28,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=372073.3333333333, ans=0.05 2023-09-29 12:58:29,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 12:58:36,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:38,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:38,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 12:58:39,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:39,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:41,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:43,264 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.976e+02 2.292e+02 2.609e+02 3.713e+02, threshold=4.584e+02, percent-clipped=0.0 2023-09-29 12:58:44,819 INFO [train.py:1039] (0/4) Epoch 11, batch 2700, loss[loss=0.2221, simple_loss=0.2938, pruned_loss=0.07515, over 23215.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2726, pruned_loss=0.06549, over 4708219.93 frames. ], batch size: 105, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:58:44,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:58:44,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 12:58:46,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:58:47,043 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-09-29 12:58:48,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 12:58:49,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 12:58:49,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:49,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:51,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:58:51,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:58:51,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:58:53,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 12:58:54,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:58:56,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:57,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:58:59,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:02,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 12:59:04,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 12:59:04,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:08,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:59:08,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:14,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:59:14,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:59:14,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:59:14,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:59:19,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:21,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:21,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:59:21,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 12:59:25,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:27,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:59:28,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=372273.3333333333, ans=0.0 2023-09-29 12:59:37,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:59:37,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=372340.0, ans=0.125 2023-09-29 12:59:38,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:59:41,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:59:41,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:59:45,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:45,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:47,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 12:59:50,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:52,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:52,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:59:55,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:55,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:56,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:59,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 13:00:01,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:03,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:00:03,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 13:00:05,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. 
Duration: 0.94 2023-09-29 13:00:05,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=372473.3333333333, ans=0.1 2023-09-29 13:00:06,670 INFO [train.py:1039] (0/4) Epoch 11, batch 2750, loss[loss=0.2054, simple_loss=0.2835, pruned_loss=0.06367, over 24074.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2722, pruned_loss=0.06523, over 4715819.18 frames. ], batch size: 80, lr: 9.43e-03, grad_scale: 8.0 2023-09-29 13:00:06,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:10,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:10,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:12,139 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=372473.3333333333, ans=0.0 2023-09-29 13:00:13,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:13,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:00:13,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:17,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-29 13:00:17,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:00:17,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 13:00:19,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:00:19,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:24,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 13:00:26,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=372540.0, ans=0.0 2023-09-29 13:00:27,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:00:27,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:27,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:00:29,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:00:29,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:00:31,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:00:32,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:32,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:37,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:00:37,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:00:37,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:00:39,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:41,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:00:47,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:49,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:00:50,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 13:00:53,234 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=13.85 vs. limit=15.0 2023-09-29 13:00:57,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:57,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:00:57,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:01:02,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:01:02,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=372673.3333333333, ans=0.0 2023-09-29 13:01:03,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:01:03,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 13:01:08,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:09,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 13:01:13,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=372740.0, ans=0.0 2023-09-29 13:01:15,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 13:01:18,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:01:19,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 13:01:20,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:01:23,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:01:23,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 13:01:24,100 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.87 vs. limit=22.5 2023-09-29 13:01:24,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:01:25,972 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.991e+02 2.210e+02 2.554e+02 4.000e+02, threshold=4.420e+02, percent-clipped=0.0 2023-09-29 13:01:27,624 INFO [train.py:1039] (0/4) Epoch 11, batch 2800, loss[loss=0.2163, simple_loss=0.289, pruned_loss=0.07179, over 24014.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.2708, pruned_loss=0.06488, over 4720223.60 frames. ], batch size: 86, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:01:27,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:01:27,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:01:29,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:01:29,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 13:01:29,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:29,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:32,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:33,598 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 13:01:33,599 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 13:01:36,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:38,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:01:39,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:01:42,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:01:43,092 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=372873.3333333333, ans=0.1 2023-09-29 13:01:44,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 13:01:47,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:01:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 13:01:51,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:52,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:01:52,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:01:56,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:01:56,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:56,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:01:56,614 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.93 vs. 
limit=15.0 2023-09-29 13:01:57,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:00,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=372940.0, ans=0.125 2023-09-29 13:02:01,219 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.46 vs. limit=15.0 2023-09-29 13:02:04,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:02:07,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:09,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=372940.0, ans=0.125 2023-09-29 13:02:10,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:12,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:02:13,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:18,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:18,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 13:02:19,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 13:02:20,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:20,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:02:25,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:25,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:29,952 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-09-29 13:02:30,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:32,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:02:32,698 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.26 vs. limit=22.5 2023-09-29 13:02:33,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:33,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:02:33,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 13:02:34,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.38 vs. limit=15.0 2023-09-29 13:02:34,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:02:35,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:02:35,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 13:02:36,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:38,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 13:02:38,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=373073.3333333333, ans=0.125 2023-09-29 13:02:40,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:40,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:02:40,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 13:02:41,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 13:02:49,520 INFO [train.py:1039] (0/4) Epoch 11, batch 2850, loss[loss=0.175, simple_loss=0.2413, pruned_loss=0.0544, over 18990.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2701, pruned_loss=0.06399, over 4722936.02 frames. ], batch size: 41, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:02:49,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:49,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:02:51,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:02:53,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:02:58,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:02:58,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:58,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:03:01,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:01,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:03:03,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:03:03,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 13:03:10,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 13:03:10,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:11,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 13:03:13,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:14,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 13:03:14,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 13:03:16,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:20,305 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=373206.6666666667, ans=0.0 2023-09-29 13:03:25,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=373273.3333333333, ans=0.125 2023-09-29 13:03:30,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:03:32,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:32,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:03:33,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:03:33,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:03:34,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=373273.3333333333, ans=0.0 2023-09-29 13:03:35,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:03:37,157 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-56000.pt 2023-09-29 13:03:40,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:03:41,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 13:03:42,229 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=373340.0, ans=0.0 2023-09-29 13:03:43,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:03:43,971 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.29 vs. 
limit=15.0 2023-09-29 13:03:44,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:03:44,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:44,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=373340.0, ans=0.0 2023-09-29 13:03:46,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:48,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:48,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:48,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=373340.0, ans=0.1 2023-09-29 13:03:50,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:53,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:55,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:03:55,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:56,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:03:58,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:03:59,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=373406.6666666667, ans=0.2 2023-09-29 13:04:04,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:04:06,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 13:04:06,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 13:04:10,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:04:10,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:10,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 13:04:10,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:04:11,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:13,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:13,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 13:04:13,409 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 13:04:13,474 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 13:04:14,717 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.974e+02 2.166e+02 2.699e+02 4.540e+02, threshold=4.331e+02, percent-clipped=1.0 2023-09-29 13:04:14,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:14,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:16,401 INFO [train.py:1039] (0/4) Epoch 11, batch 2900, loss[loss=0.199, simple_loss=0.2795, pruned_loss=0.05923, over 24647.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2707, pruned_loss=0.06444, over 4725349.01 frames. ], batch size: 73, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:04:18,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:18,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:18,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:04:19,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. 
Duration: 0.75 2023-09-29 13:04:20,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=373473.3333333333, ans=0.0 2023-09-29 13:04:20,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=373473.3333333333, ans=0.125 2023-09-29 13:04:23,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=373473.3333333333, ans=0.025 2023-09-29 13:04:25,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:26,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 13:04:26,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 13:04:28,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:04:28,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:04:30,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:04:35,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=373540.0, ans=0.2 2023-09-29 13:04:36,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 13:04:36,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:40,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:04:40,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 13:04:41,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:04:43,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:45,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 13:04:45,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 13:04:48,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:48,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 13:04:48,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:04:49,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:04:49,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 13:04:50,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=373606.6666666667, ans=0.125 2023-09-29 13:04:52,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:53,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:58,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:05:01,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:04,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 13:05:04,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 13:05:04,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:05:07,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:05:10,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 13:05:13,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:05:19,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:05:19,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=373673.3333333333, ans=0.0 2023-09-29 13:05:19,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=373673.3333333333, ans=0.125 2023-09-29 13:05:27,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:05:28,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:05:30,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 13:05:34,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:34,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 13:05:34,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:35,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:05:39,102 INFO [train.py:1039] (0/4) Epoch 11, batch 2950, loss[loss=0.1882, simple_loss=0.2611, pruned_loss=0.05766, over 24478.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.2716, pruned_loss=0.06505, over 4722070.67 frames. ], batch size: 63, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:05:43,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:05:45,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 13:05:45,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:05:45,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:45,481 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=373806.6666666667, ans=0.035 2023-09-29 13:05:46,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:05:48,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:05:48,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 13:05:50,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 13:05:50,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=373806.6666666667, ans=0.1 2023-09-29 13:05:52,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:05:52,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:06:00,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 13:06:01,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:04,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:06:04,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:08,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:08,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:06:08,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=373873.3333333333, ans=0.0 2023-09-29 13:06:11,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:06:14,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 13:06:16,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=373940.0, ans=0.125 2023-09-29 13:06:20,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. 
Duration: 0.95275 2023-09-29 13:06:20,915 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 13:06:22,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:06:24,007 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 13:06:24,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 13:06:25,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:27,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:27,065 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 13:06:27,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:06:27,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=374006.6666666667, ans=0.1 2023-09-29 13:06:29,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 13:06:31,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:32,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 13:06:34,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:34,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:06:36,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:36,082 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 13:06:36,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:37,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 13:06:45,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:45,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:06:47,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 13:06:47,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:06:48,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 13:06:50,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:06:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:53,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:06:55,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:55,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:06:55,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374073.3333333333, ans=0.1 2023-09-29 13:06:58,222 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.184e+02 2.499e+02 3.162e+02 5.312e+02, threshold=4.998e+02, percent-clipped=4.0 2023-09-29 13:06:58,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:06:59,752 INFO [train.py:1039] (0/4) Epoch 11, batch 3000, loss[loss=0.1961, simple_loss=0.2701, pruned_loss=0.061, over 23308.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2722, pruned_loss=0.06519, over 4723049.65 frames. ], batch size: 93, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:06:59,753 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 13:07:11,689 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.0.layers.0.self_attn_weights, attn_weights_entropy = tensor([5.6884, 5.1321, 5.3434, 5.5496], device='cuda:0') 2023-09-29 13:07:13,670 INFO [train.py:1071] (0/4) Epoch 11, validation: loss=0.3146, simple_loss=0.2865, pruned_loss=0.1713, over 1125622.00 frames. 
2023-09-29 13:07:13,670 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 13:07:13,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:13,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:07:13,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:07:15,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:07:16,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:07:17,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=374140.0, ans=0.0 2023-09-29 13:07:18,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:18,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 13:07:18,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=374140.0, ans=0.125 2023-09-29 13:07:20,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:23,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:07:23,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:07:28,279 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 13:07:28,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 13:07:29,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:07:31,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:07:31,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 13:07:31,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:36,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:07:37,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=374206.6666666667, ans=0.05 2023-09-29 13:07:46,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:07:52,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 13:07:52,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 13:07:54,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:07:56,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:56,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:07:57,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:07:57,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 13:08:00,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 13:08:01,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:08:02,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:08:04,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:08:04,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:07,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:07,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:08:10,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:08:10,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:08:10,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:08:13,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:15,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=374340.0, ans=0.125 2023-09-29 13:08:17,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 13:08:18,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:08:18,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:18,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:08:23,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:23,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:26,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. 
Duration: 0.488875 2023-09-29 13:08:27,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 13:08:28,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:08:28,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 13:08:28,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:08:30,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 13:08:31,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374406.6666666667, ans=0.1 2023-09-29 13:08:34,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:08:35,958 INFO [train.py:1039] (0/4) Epoch 11, batch 3050, loss[loss=0.2245, simple_loss=0.2833, pruned_loss=0.08292, over 22744.00 frames. ], tot_loss[loss=0.2039, simple_loss=0.2741, pruned_loss=0.06684, over 4698912.40 frames. ], batch size: 322, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:08:36,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:08:36,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 13:08:37,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. 
Duration: 0.91 2023-09-29 13:08:37,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:08:37,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=374473.3333333333, ans=0.0 2023-09-29 13:08:39,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:08:39,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:39,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:08:39,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:39,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:08:44,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 13:08:45,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:08:47,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:08:48,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:08:52,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:08:52,614 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=374540.0, ans=0.125 2023-09-29 13:08:56,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 13:09:02,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 13:09:02,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 13:09:02,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:06,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:09:07,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=374606.6666666667, ans=0.1 2023-09-29 13:09:09,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:09,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:10,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:12,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:12,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 13:09:13,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:13,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:13,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:14,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=374606.6666666667, ans=0.125 2023-09-29 13:09:14,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=374606.6666666667, ans=0.1 2023-09-29 13:09:15,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:18,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:22,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:23,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 13:09:24,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:24,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 13:09:27,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:09:27,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=374673.3333333333, ans=0.0 2023-09-29 13:09:29,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:09:29,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:09:29,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:34,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:34,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:42,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:42,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:09:42,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 13:09:42,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=374740.0, ans=0.125 2023-09-29 13:09:44,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=374740.0, ans=0.0 2023-09-29 13:09:45,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:45,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:09:47,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:47,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 13:09:49,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:49,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:50,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 13:09:52,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:57,386 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.035e+02 2.257e+02 2.557e+02 3.814e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-29 13:09:58,905 INFO [train.py:1039] (0/4) Epoch 11, batch 3100, loss[loss=0.2034, simple_loss=0.273, pruned_loss=0.06688, over 23485.00 frames. 
], tot_loss[loss=0.2036, simple_loss=0.2738, pruned_loss=0.0667, over 4703048.54 frames. ], batch size: 106, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:09:59,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:00,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:10:03,133 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.31 vs. limit=15.0 2023-09-29 13:10:04,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:10:04,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 13:10:07,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 13:10:07,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 13:10:10,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:10:13,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:10:14,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:15,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-09-29 13:10:20,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:23,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=374873.3333333333, ans=0.1 2023-09-29 13:10:26,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 13:10:30,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:10:31,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:31,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:10:31,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:10:33,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:10:37,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:10:37,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 13:10:37,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:10:38,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:10:41,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 13:10:43,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:10:46,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:10:47,151 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=375006.6666666667, ans=0.125 2023-09-29 13:10:48,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 13:10:48,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 13:10:49,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:51,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:54,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:10:54,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:54,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:10:55,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 13:10:55,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:57,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:10:57,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:10:57,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:57,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:11:01,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:11:03,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 13:11:06,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:11:06,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 13:11:07,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:07,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:07,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-09-29 13:11:14,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=375073.3333333333, ans=0.0 2023-09-29 13:11:19,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 13:11:20,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.90 vs. limit=22.5 2023-09-29 13:11:21,154 INFO [train.py:1039] (0/4) Epoch 11, batch 3150, loss[loss=0.1897, simple_loss=0.2464, pruned_loss=0.06654, over 23436.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2715, pruned_loss=0.06689, over 4677538.42 frames. ], batch size: 285, lr: 9.40e-03, grad_scale: 16.0 2023-09-29 13:11:22,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:22,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:25,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:11:25,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:11:27,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 13:11:28,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:28,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 13:11:30,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 13:11:31,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:33,339 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 13:11:33,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=375140.0, ans=0.0 2023-09-29 13:11:37,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 13:11:37,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:11:40,256 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 13:11:40,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:11:41,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 13:11:43,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 13:11:43,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 13:11:43,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:11:43,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:11:43,632 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=375206.6666666667, ans=0.125 2023-09-29 13:11:44,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:47,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 13:11:50,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:50,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:52,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:11:52,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=375273.3333333333, ans=0.0 2023-09-29 13:11:53,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:11:56,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 13:11:57,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:11:58,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. 
Duration: 0.82225 2023-09-29 13:11:58,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=375273.3333333333, ans=0.125 2023-09-29 13:12:00,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:12:00,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 13:12:01,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 13:12:03,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:12:03,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:12:03,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:12:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:04,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:12:07,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:12:08,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:12:08,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-09-29 13:12:10,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:12:10,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:10,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=375340.0, ans=0.5 2023-09-29 13:12:13,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:12:13,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:12:13,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 13:12:15,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:16,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 13:12:16,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:18,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 13:12:19,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 13:12:21,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 13:12:21,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:23,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 13:12:24,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 13:12:26,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:29,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:12:31,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:31,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:12:32,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=375406.6666666667, ans=0.125 2023-09-29 13:12:33,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=375406.6666666667, ans=0.125 2023-09-29 13:12:37,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:12:37,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:12:39,896 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.896e+02 2.240e+02 2.702e+02 3.896e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 13:12:40,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 13:12:41,473 INFO [train.py:1039] (0/4) Epoch 11, batch 3200, loss[loss=0.2276, simple_loss=0.3009, pruned_loss=0.0771, over 24047.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2705, pruned_loss=0.06656, over 4680938.67 frames. ], batch size: 80, lr: 9.40e-03, grad_scale: 32.0 2023-09-29 13:12:45,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:12:45,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 13:12:50,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:50,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:12:50,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 13:12:53,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:13:00,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 13:13:02,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=375540.0, ans=0.125 2023-09-29 13:13:03,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:13:10,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=375540.0, ans=0.125 2023-09-29 13:13:11,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:13:11,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=375540.0, ans=0.05 2023-09-29 13:13:21,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 13:13:23,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:13:25,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 13:13:26,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:13:29,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:13:30,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:13:31,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 13:13:34,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 13:13:35,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:13:38,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 13:13:41,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 13:13:43,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:13:49,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:49,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:13:49,343 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=375740.0, ans=0.0 2023-09-29 13:13:50,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:50,459 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 13:13:50,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 13:13:52,931 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=375740.0, ans=0.09899494936611666 2023-09-29 13:13:57,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:13:57,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 13:13:57,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=375740.0, ans=0.125 2023-09-29 13:13:58,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 13:14:00,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 13:14:01,216 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.17 vs. limit=15.0 2023-09-29 13:14:01,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 13:14:04,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:14:06,020 INFO [train.py:1039] (0/4) Epoch 11, batch 3250, loss[loss=0.2549, simple_loss=0.2918, pruned_loss=0.1091, over 19352.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2708, pruned_loss=0.06643, over 4676817.20 frames. ], batch size: 388, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:14:06,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 13:14:07,659 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 13:14:07,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:10,645 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 13:14:15,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:14:18,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:19,158 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.54 vs. limit=22.5 2023-09-29 13:14:20,677 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.26 vs. limit=15.0 2023-09-29 13:14:23,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=375873.3333333333, ans=0.2 2023-09-29 13:14:25,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=375873.3333333333, ans=0.1 2023-09-29 13:14:26,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:14:26,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. 
Duration: 0.99 2023-09-29 13:14:26,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:26,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:14:26,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:28,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:28,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:14:31,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:31,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:14:32,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:33,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:33,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:33,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:14:37,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:14:37,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:38,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:40,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:42,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:42,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:42,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:14:47,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 13:14:48,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:48,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:14:50,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:50,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:14:57,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 13:15:05,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:06,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:06,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 13:15:06,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:15:06,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:15:06,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:10,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 13:15:10,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 13:15:12,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:15:12,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:13,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:13,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. 
Duration: 0.52225 2023-09-29 13:15:15,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:18,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:18,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:20,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=376073.3333333333, ans=0.0 2023-09-29 13:15:21,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 13:15:21,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:24,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:15:24,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 13:15:26,168 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.864e+02 2.159e+02 2.577e+02 4.318e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-29 13:15:27,697 INFO [train.py:1039] (0/4) Epoch 11, batch 3300, loss[loss=0.183, simple_loss=0.2601, pruned_loss=0.05298, over 24606.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2713, pruned_loss=0.06629, over 4685587.03 frames. ], batch size: 60, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:15:27,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 13:15:27,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 13:15:29,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 13:15:29,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 13:15:29,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:34,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:36,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:15:36,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:38,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:15:38,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:15:39,401 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.75 vs. limit=12.0 2023-09-29 13:15:42,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:43,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:15:48,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 13:15:48,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:15:48,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:50,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:51,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 13:15:52,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=376206.6666666667, ans=0.5 2023-09-29 13:15:53,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:15:53,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:15:55,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:15:55,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:15:55,373 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 13:15:58,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:15:58,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:16:01,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:01,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 13:16:03,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 13:16:03,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:05,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:16:06,962 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=376273.3333333333, ans=0.125 2023-09-29 13:16:08,189 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 13:16:08,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=376273.3333333333, ans=0.1 2023-09-29 13:16:08,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=376273.3333333333, ans=0.0 2023-09-29 13:16:09,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 13:16:09,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 13:16:12,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 13:16:14,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:19,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:16:19,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:22,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:22,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:22,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:16:22,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:16:22,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=376340.0, ans=0.0 2023-09-29 13:16:22,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=376340.0, ans=0.05 2023-09-29 13:16:24,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:16:24,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:16:25,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:16:27,901 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 13:16:29,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 13:16:30,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:16:30,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:16:30,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:33,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:33,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:34,228 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=376406.6666666667, ans=0.0 2023-09-29 13:16:36,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:16:36,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:36,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 13:16:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:40,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:16:43,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 13:16:43,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:46,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:46,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:16:48,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:50,026 INFO [train.py:1039] (0/4) Epoch 11, batch 3350, loss[loss=0.1866, simple_loss=0.2758, pruned_loss=0.04864, over 24566.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2721, pruned_loss=0.06633, over 4697734.79 frames. ], batch size: 71, lr: 9.38e-03, grad_scale: 32.0 2023-09-29 13:16:50,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:52,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:52,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:16:52,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=376473.3333333333, ans=0.015 2023-09-29 13:16:54,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:55,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:58,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:17:02,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:05,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:17:05,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:17:08,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 13:17:08,295 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 13:17:08,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:11,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-09-29 13:17:13,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 13:17:14,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.10 vs. limit=15.0 2023-09-29 13:17:14,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:17:14,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:17:16,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:16,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 13:17:16,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:17,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:17:19,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:19,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=376540.0, ans=0.0 2023-09-29 13:17:22,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:17:22,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:24,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:17:28,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:31,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:32,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:37,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:17:37,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:37,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=376673.3333333333, ans=0.125 2023-09-29 13:17:39,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=376673.3333333333, ans=0.0 2023-09-29 13:17:40,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:40,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:42,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:17:44,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 13:17:44,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:17:44,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 13:17:45,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:17:47,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 13:17:47,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=376673.3333333333, ans=0.2 2023-09-29 13:17:48,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:50,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:56,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:57,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 13:17:59,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:01,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 13:18:01,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:18:03,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=376740.0, ans=0.0 2023-09-29 13:18:05,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:08,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 13:18:08,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:18:09,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:18:10,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:10,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 13:18:11,967 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.933e+02 2.082e+02 2.375e+02 4.063e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 13:18:12,012 INFO [train.py:1039] (0/4) Epoch 11, batch 3400, loss[loss=0.2216, simple_loss=0.2755, pruned_loss=0.0839, over 22653.00 frames. ], tot_loss[loss=0.2011, simple_loss=0.2717, pruned_loss=0.06528, over 4711536.09 frames. ], batch size: 322, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:18:12,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:18:12,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 13:18:13,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:18:17,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:18:17,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 13:18:21,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=376806.6666666667, ans=0.125 2023-09-29 13:18:21,985 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-09-29 13:18:22,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 13:18:22,634 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 13:18:22,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:18:27,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:18:27,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:27,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:28,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:18:35,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:18:37,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 13:18:40,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:18:42,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:42,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:43,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:18:51,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:18:55,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-09-29 13:19:02,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=377006.6666666667, ans=0.0 2023-09-29 13:19:03,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:04,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:04,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 13:19:04,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:07,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:07,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:19:07,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:19:08,212 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.66 vs. limit=12.0 2023-09-29 13:19:12,072 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.22 vs. limit=6.0 2023-09-29 13:19:12,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:19:15,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:19:15,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:19:21,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:24,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 13:19:30,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:19:35,261 INFO [train.py:1039] (0/4) Epoch 11, batch 3450, loss[loss=0.2095, simple_loss=0.2867, pruned_loss=0.06617, over 24620.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2713, pruned_loss=0.06506, over 4715637.10 frames. ], batch size: 68, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:19:35,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 13:19:38,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 13:19:38,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=377140.0, ans=0.125 2023-09-29 13:19:40,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:41,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 13:19:41,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 13:19:43,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:49,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:19:53,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:19:54,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:19:55,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:19:55,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:57,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:05,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 13:20:10,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 13:20:10,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:20:12,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 13:20:13,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:19,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 13:20:21,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:20:26,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:27,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:20:29,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:20:29,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:20:31,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 13:20:31,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:32,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:35,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:20:37,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-09-29 13:20:38,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=377340.0, ans=0.09899494936611666 2023-09-29 13:20:41,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:20:44,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:20:47,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:48,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=377406.6666666667, ans=0.125 2023-09-29 13:20:51,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:20:52,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=377406.6666666667, ans=0.125 2023-09-29 13:20:56,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:56,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:58,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:20:58,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:20:59,828 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.022e+02 2.344e+02 2.789e+02 4.683e+02, threshold=4.688e+02, percent-clipped=2.0 2023-09-29 13:20:59,873 INFO [train.py:1039] (0/4) Epoch 11, batch 3500, loss[loss=0.1965, simple_loss=0.2665, pruned_loss=0.06326, over 23324.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2699, pruned_loss=0.06457, over 4716346.78 frames. ], batch size: 105, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:21:00,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:04,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:21:04,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 13:21:08,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:21:09,016 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.11 vs. limit=15.0 2023-09-29 13:21:11,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:21:12,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:12,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 13:21:20,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 13:21:20,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:21:21,094 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=377540.0, ans=0.1 2023-09-29 13:21:22,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:21:22,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:23,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:21:23,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:23,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:24,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 13:21:27,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:27,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:21:29,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:33,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:21:35,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 13:21:35,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:38,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:40,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:21:42,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:45,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:21:45,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:46,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 13:21:46,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 13:21:48,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 13:21:48,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:50,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:21:50,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:51,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:21:53,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=377673.3333333333, ans=0.0 2023-09-29 13:21:55,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:21:55,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:21:59,362 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=377673.3333333333, ans=0.0 2023-09-29 13:22:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:03,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 13:22:03,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 13:22:03,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:05,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:06,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 13:22:08,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:11,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 13:22:11,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:14,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:22:16,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 13:22:17,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 13:22:19,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:19,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:21,005 INFO [train.py:1039] (0/4) Epoch 11, batch 3550, loss[loss=0.1824, simple_loss=0.2496, pruned_loss=0.05762, over 23878.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2685, pruned_loss=0.06408, over 4724703.09 frames. ], batch size: 195, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:22:21,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:21,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 13:22:23,129 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=377806.6666666667, ans=0.125 2023-09-29 13:22:25,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:22:35,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:36,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 13:22:39,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:41,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:22:42,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:44,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:22:44,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:22:47,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:47,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 13:22:47,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:47,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:22:49,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:22:52,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=377940.0, ans=0.2 2023-09-29 13:22:53,033 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.22 vs. limit=15.0 2023-09-29 13:22:55,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:22:55,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:58,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:22:58,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:00,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:23:00,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 13:23:00,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:23:01,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:03,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:23:05,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=377940.0, ans=0.125 2023-09-29 13:23:10,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:10,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=378006.6666666667, ans=0.125 2023-09-29 13:23:12,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:23:13,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:13,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 13:23:15,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:23:16,231 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.59 vs. limit=10.0 2023-09-29 13:23:18,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. 
Duration: 0.86 2023-09-29 13:23:19,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:23:21,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:23:21,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:23:25,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 13:23:27,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:28,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=378073.3333333333, ans=0.125 2023-09-29 13:23:31,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:33,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 13:23:33,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:38,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:39,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-09-29 13:23:40,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=378073.3333333333, ans=0.125 2023-09-29 13:23:43,827 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.951e+02 2.213e+02 2.629e+02 3.694e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 13:23:43,871 INFO [train.py:1039] (0/4) Epoch 11, batch 3600, loss[loss=0.1979, simple_loss=0.289, pruned_loss=0.05335, over 24649.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2689, pruned_loss=0.06411, over 4726956.23 frames. ], batch size: 73, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:23:45,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 13:23:45,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:23:47,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:23:48,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=378140.0, ans=0.125 2023-09-29 13:23:49,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:49,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:50,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:23:53,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 13:23:55,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:57,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:23:57,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:23:59,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:59,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 13:24:03,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:24:05,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:07,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=378206.6666666667, ans=0.2 2023-09-29 13:24:08,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:12,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:13,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:24:13,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 13:24:13,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 13:24:15,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:18,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:21,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:24:22,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:24,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:25,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:24:25,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 13:24:35,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:24:36,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:24:37,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 13:24:39,378 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.36 vs. 
limit=22.5 2023-09-29 13:24:40,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=378340.0, ans=0.2 2023-09-29 13:24:41,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:24:45,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:48,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:56,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:24:56,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:24:56,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 13:24:57,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 13:24:59,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 13:25:02,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:25:02,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:25:03,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. 
Duration: 0.84 2023-09-29 13:25:03,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:03,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:25:03,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:04,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 13:25:06,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 13:25:07,542 INFO [train.py:1039] (0/4) Epoch 11, batch 3650, loss[loss=0.1683, simple_loss=0.2405, pruned_loss=0.04807, over 20377.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2699, pruned_loss=0.06458, over 4710966.12 frames. ], batch size: 44, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:25:07,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:25:07,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 13:25:08,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.47 vs. limit=15.0 2023-09-29 13:25:14,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 13:25:16,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 13:25:20,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 13:25:23,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 13:25:24,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=378540.0, ans=0.0 2023-09-29 13:25:25,814 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=378540.0, ans=0.09899494936611666 2023-09-29 13:25:27,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:25:27,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:25:27,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:25:31,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:25:32,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:32,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 13:25:33,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:25:34,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 13:25:34,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 13:25:36,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:25:37,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:25:37,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:37,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:25:39,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 13:25:41,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 13:25:43,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:25:44,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 13:25:46,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:25:46,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 13:25:51,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=378606.6666666667, ans=0.035 2023-09-29 13:25:54,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:25:56,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:56,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:25:56,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=378673.3333333333, ans=0.2 2023-09-29 13:25:57,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:25:59,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:26:02,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:26:04,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:06,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:06,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:26:06,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 13:26:08,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:26:09,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:15,139 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 13:26:19,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:19,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:20,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:26:21,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:21,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:26:23,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:24,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 13:26:24,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:26,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-29 13:26:28,275 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=378740.0, ans=0.125 2023-09-29 13:26:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:30,848 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.035e+02 2.257e+02 2.600e+02 3.794e+02, threshold=4.515e+02, percent-clipped=0.0 2023-09-29 13:26:30,893 INFO [train.py:1039] (0/4) Epoch 11, batch 3700, loss[loss=0.1968, simple_loss=0.2753, pruned_loss=0.05917, over 24653.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2716, pruned_loss=0.06554, over 4705750.66 frames. ], batch size: 68, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:26:31,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:26:31,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=378806.6666666667, ans=0.04949747468305833 2023-09-29 13:26:32,781 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:26:34,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:34,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 13:26:34,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:36,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 13:26:36,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:26:39,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:26:41,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:42,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:42,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:26:44,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:44,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:26:46,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:48,275 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 13:26:57,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:26:58,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:26:59,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 13:26:59,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 13:26:59,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:03,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=378940.0, ans=0.1 2023-09-29 13:27:03,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=378940.0, ans=0.125 2023-09-29 13:27:04,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:04,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 13:27:06,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=378940.0, ans=0.125 2023-09-29 13:27:07,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:27:13,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:13,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 13:27:14,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:27:19,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:19,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 13:27:19,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:27:19,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 13:27:23,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:27:25,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:27:28,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:28,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 13:27:31,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:27:31,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:27:31,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 13:27:31,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:31,493 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:27:34,385 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-09-29 13:27:35,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:36,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 13:27:38,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 13:27:38,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:27:38,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:27:40,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:27:41,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:27:44,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=379073.3333333333, ans=0.0 2023-09-29 13:27:46,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:27:49,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:27:50,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=379073.3333333333, ans=0.125 2023-09-29 13:27:51,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:27:53,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 13:27:55,182 INFO [train.py:1039] (0/4) Epoch 11, batch 3750, loss[loss=0.2148, simple_loss=0.2873, pruned_loss=0.07115, over 23294.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2716, pruned_loss=0.06512, over 4718814.74 frames. ], batch size: 105, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:27:55,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:27:57,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:27:58,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 13:27:58,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:28:00,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:01,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:28:03,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:06,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:09,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:28:11,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:28:13,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:28:15,421 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-09-29 13:28:16,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:18,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 13:28:20,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:20,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:21,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:28:23,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 13:28:28,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 13:28:28,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=379273.3333333333, ans=0.125 2023-09-29 13:28:30,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:31,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:33,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:38,130 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=379273.3333333333, ans=0.0 2023-09-29 13:28:39,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:42,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 13:28:44,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 13:28:48,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:53,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 13:28:53,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:28:58,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:29:02,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:29:03,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:29:05,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:29:05,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=379406.6666666667, ans=0.0 2023-09-29 13:29:07,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:29:08,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:29:12,327 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=379406.6666666667, ans=0.125 2023-09-29 13:29:16,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 13:29:18,115 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.004e+02 2.192e+02 2.441e+02 3.152e+02, threshold=4.385e+02, percent-clipped=0.0 2023-09-29 13:29:18,160 INFO [train.py:1039] (0/4) Epoch 11, batch 3800, loss[loss=0.1909, simple_loss=0.2648, pruned_loss=0.0585, over 24652.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2721, pruned_loss=0.06537, over 4725587.82 frames. ], batch size: 65, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:29:19,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:19,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:29:21,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 13:29:24,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:24,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:24,864 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.65 vs. limit=10.0 2023-09-29 13:29:25,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:29:27,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 13:29:27,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:29:29,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:29:29,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=379473.3333333333, ans=0.1 2023-09-29 13:29:30,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:32,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:29:32,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:34,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 13:29:39,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 13:29:39,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:29:40,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:45,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:29:45,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:29:47,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 13:29:47,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:49,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:51,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:56,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:29:56,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 13:29:58,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:30:05,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:10,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:30:12,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 13:30:14,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 13:30:15,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:17,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:30:17,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:18,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 13:30:23,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 13:30:23,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 13:30:24,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:26,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:32,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:30:33,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:30:34,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=379740.0, ans=0.1 2023-09-29 13:30:40,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:30:40,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 13:30:42,627 INFO [train.py:1039] (0/4) Epoch 11, batch 3850, loss[loss=0.1837, simple_loss=0.2503, pruned_loss=0.05851, over 23754.00 frames. 
], tot_loss[loss=0.2005, simple_loss=0.2708, pruned_loss=0.06509, over 4715452.35 frames. ], batch size: 149, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:30:42,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:30:42,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:46,804 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.06 vs. limit=15.0 2023-09-29 13:30:48,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:30:51,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:54,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:30:56,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 13:30:58,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=379873.3333333333, ans=0.09899494936611666 2023-09-29 13:31:01,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:05,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:31:06,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:31:06,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:31:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:12,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:31:12,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:12,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:31:13,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:16,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:18,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:18,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:31:18,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 13:31:18,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 13:31:19,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:31:19,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:21,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:22,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:23,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 13:31:26,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 13:31:28,191 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.54 vs. limit=22.5 2023-09-29 13:31:29,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:30,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 13:31:33,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:31:40,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:40,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:31:44,788 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.15 vs. limit=15.0 2023-09-29 13:31:45,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:45,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 13:31:49,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 13:31:50,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:50,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:55,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:31:55,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:31:55,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 13:31:56,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 13:31:58,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:32:01,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 13:32:01,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:01,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:04,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:32:05,645 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.057e+02 2.295e+02 2.790e+02 3.822e+02, threshold=4.589e+02, percent-clipped=0.0 2023-09-29 13:32:05,689 INFO [train.py:1039] (0/4) Epoch 11, batch 3900, loss[loss=0.1918, simple_loss=0.2678, pruned_loss=0.0579, over 23490.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2686, pruned_loss=0.0645, over 4706151.97 frames. ], batch size: 106, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:32:05,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:07,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:32:07,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:32:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:32:09,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:10,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 13:32:10,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:15,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:16,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:16,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:32:16,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:20,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:20,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:23,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:32:24,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. 
Duration: 0.81 2023-09-29 13:32:25,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:27,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 13:32:27,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:29,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 13:32:29,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 13:32:33,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=380206.6666666667, ans=0.0 2023-09-29 13:32:35,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:37,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:37,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:32:37,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:32:40,105 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.05 vs. limit=15.0 2023-09-29 13:32:40,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 13:32:42,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:32:44,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:32:44,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:32:45,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:32:52,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:52,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:33:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:33:04,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:33:13,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:17,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:17,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 13:33:17,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. 
Duration: 0.6 2023-09-29 13:33:17,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:17,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=380406.6666666667, ans=0.1 2023-09-29 13:33:20,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 13:33:21,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:33:23,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 13:33:28,105 INFO [train.py:1039] (0/4) Epoch 11, batch 3950, loss[loss=0.2136, simple_loss=0.2731, pruned_loss=0.07703, over 23789.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.269, pruned_loss=0.06387, over 4715767.14 frames. ], batch size: 164, lr: 9.34e-03, grad_scale: 16.0 2023-09-29 13:33:29,121 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=12.0 2023-09-29 13:33:29,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:33:31,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 13:33:31,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:33:34,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 13:33:36,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:33:45,422 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 13:33:45,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:45,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 13:33:47,066 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 13:33:48,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:50,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=380540.0, ans=0.0 2023-09-29 13:33:51,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:51,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:33:51,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:54,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 13:33:57,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:33:59,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:59,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:34:00,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:34:00,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:34:09,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=380606.6666666667, ans=0.5 2023-09-29 13:34:12,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:34:14,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:34:19,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 13:34:24,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=380673.3333333333, ans=0.0 2023-09-29 13:34:25,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 13:34:25,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 13:34:25,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 13:34:26,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=380673.3333333333, ans=0.125 2023-09-29 13:34:27,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:34:36,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:34:36,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:34:36,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:34:36,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:34:36,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 13:34:42,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:34:43,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:34:47,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 13:34:47,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=380740.0, ans=0.125 2023-09-29 13:34:50,668 INFO [train.py:1039] (0/4) Epoch 11, batch 4000, loss[loss=0.1935, simple_loss=0.2556, pruned_loss=0.06567, over 23713.00 frames. 
], tot_loss[loss=0.1998, simple_loss=0.2707, pruned_loss=0.0644, over 4714306.08 frames. ], batch size: 149, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:34:52,648 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.905e+02 2.140e+02 2.457e+02 3.925e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 13:34:58,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:05,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:10,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:10,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:35:10,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:10,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 13:35:12,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:35:12,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 13:35:12,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:35:12,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-09-29 13:35:14,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:14,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=380873.3333333333, ans=0.0 2023-09-29 13:35:15,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=380873.3333333333, ans=0.125 2023-09-29 13:35:18,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:35:18,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:35:18,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:35:18,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:18,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:35:20,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:35:22,450 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 13:35:22,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:35:22,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:35:26,408 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 13:35:26,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:35:26,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:28,382 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:35:34,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 13:35:34,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:37,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:35:38,831 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 13:35:40,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:35:40,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 13:35:40,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:35:42,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:35:42,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:35:45,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:35:45,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:35:46,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:47,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 13:35:48,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:50,610 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 13:35:50,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=381006.6666666667, ans=0.0 2023-09-29 13:35:53,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:35:57,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:35:59,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:35:59,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:35:59,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=381073.3333333333, ans=0.125 2023-09-29 13:36:00,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:36:02,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:05,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=381073.3333333333, ans=0.0 2023-09-29 13:36:06,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:09,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:36:10,654 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=15.0 2023-09-29 13:36:11,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 13:36:13,375 INFO [train.py:1039] (0/4) Epoch 11, batch 4050, loss[loss=0.2437, simple_loss=0.2952, pruned_loss=0.09606, over 22664.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2707, pruned_loss=0.06461, over 4714041.00 frames. ], batch size: 322, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:36:13,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:36:13,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 13:36:14,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:36:16,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:19,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:22,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:25,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:36:26,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:36:28,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:36:28,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:36:32,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:35,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 13:36:35,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=381206.6666666667, ans=0.125 2023-09-29 13:36:38,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 13:36:40,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 13:36:40,124 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 13:36:41,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:36:51,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 13:36:52,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:36:56,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:59,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:59,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:36:59,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:37:03,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 13:37:04,247 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.35 vs. limit=6.0 2023-09-29 13:37:07,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 13:37:07,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:37:09,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:09,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=381340.0, ans=0.1 2023-09-29 13:37:12,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 13:37:16,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:26,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 13:37:26,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:37:26,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:37:29,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 13:37:29,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. 
Duration: 0.97 2023-09-29 13:37:29,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:30,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=381406.6666666667, ans=0.125 2023-09-29 13:37:32,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:37:34,405 INFO [train.py:1039] (0/4) Epoch 11, batch 4100, loss[loss=0.1932, simple_loss=0.2644, pruned_loss=0.06104, over 24628.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2713, pruned_loss=0.06458, over 4714823.90 frames. ], batch size: 60, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:37:34,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:34,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:37:35,618 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.41 vs. limit=15.0 2023-09-29 13:37:35,983 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.063e+02 2.315e+02 3.202e+02 5.550e+02, threshold=4.630e+02, percent-clipped=7.0 2023-09-29 13:37:41,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 13:37:43,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 13:37:45,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. 
Duration: 0.94 2023-09-29 13:37:46,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 13:37:46,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:48,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:37:49,956 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 13:37:52,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:37:54,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:37:54,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:55,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:37:59,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:38:01,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 13:38:01,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:38:01,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 13:38:02,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:02,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:38:02,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:02,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:38:04,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 13:38:08,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:08,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 13:38:10,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:38:14,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:14,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. 
Duration: 0.79 2023-09-29 13:38:16,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:38:17,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:38:17,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:38:19,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 13:38:19,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=381606.6666666667, ans=0.0 2023-09-29 13:38:21,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:38:21,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:38:24,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 13:38:24,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:24,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:38:27,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:34,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 13:38:37,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:39,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:38:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:38:47,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:50,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:53,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:38:57,282 INFO [train.py:1039] (0/4) Epoch 11, batch 4150, loss[loss=0.2077, simple_loss=0.2702, pruned_loss=0.0726, over 23623.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2714, pruned_loss=0.06432, over 4716656.70 frames. ], batch size: 135, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:38:58,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:39:00,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:39:02,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:39:02,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 13:39:05,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 13:39:06,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:06,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 13:39:06,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 13:39:06,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 13:39:08,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:12,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=381873.3333333333, ans=0.1 2023-09-29 13:39:14,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:39:14,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:15,898 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.93 vs. limit=22.5 2023-09-29 13:39:18,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:19,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 13:39:21,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:39:23,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:39:23,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:25,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:39:29,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:34,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:35,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 13:39:37,594 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.83 vs. limit=15.0 2023-09-29 13:39:38,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 13:39:38,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:39:39,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 13:39:39,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 13:39:39,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:39:44,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:39:44,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:48,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 13:39:51,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:39:51,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:39:52,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 13:39:53,228 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=382006.6666666667, ans=0.125 2023-09-29 13:39:54,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:54,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=382006.6666666667, ans=0.125 2023-09-29 13:39:56,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 13:39:59,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 13:40:00,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:40:02,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:03,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=382073.3333333333, ans=15.0 2023-09-29 13:40:03,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 13:40:03,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:03,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:40:04,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:40:07,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 13:40:07,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:07,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:40:08,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:40:09,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. 
Duration: 0.77 2023-09-29 13:40:09,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:40:09,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:40:10,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:40:12,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:12,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 13:40:12,970 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.15 vs. limit=10.0 2023-09-29 13:40:14,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:40:18,767 INFO [train.py:1039] (0/4) Epoch 11, batch 4200, loss[loss=0.2014, simple_loss=0.2563, pruned_loss=0.07323, over 23539.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2704, pruned_loss=0.06398, over 4713428.68 frames. ], batch size: 256, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:40:20,248 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.930e+02 2.193e+02 2.587e+02 4.330e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-29 13:40:20,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:40:21,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. 
Duration: 0.94 2023-09-29 13:40:24,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:40:26,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:26,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:40:28,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:28,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 13:40:30,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=382140.0, ans=0.125 2023-09-29 13:40:33,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 13:40:35,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:36,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:38,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:40:38,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=382206.6666666667, ans=0.0 2023-09-29 13:40:41,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:40:44,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:40:44,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:44,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 13:40:44,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:46,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:47,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:48,098 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.74 vs. limit=15.0 2023-09-29 13:40:48,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:40:48,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 13:40:48,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=382206.6666666667, ans=0.1 2023-09-29 13:40:51,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 13:40:51,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:55,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:40:56,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:41:00,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:41:00,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:05,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:41:05,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 13:41:05,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:05,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 13:41:05,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=382273.3333333333, ans=0.125 2023-09-29 13:41:11,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:41:14,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:22,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:41:24,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 13:41:27,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:29,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=382406.6666666667, ans=0.0 2023-09-29 13:41:30,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:41:32,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:34,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 13:41:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:41:41,373 INFO [train.py:1039] (0/4) Epoch 11, batch 4250, loss[loss=0.1706, simple_loss=0.2454, pruned_loss=0.04793, over 24576.00 frames. 
], tot_loss[loss=0.1977, simple_loss=0.2694, pruned_loss=0.06306, over 4725272.20 frames. ], batch size: 60, lr: 9.31e-03, grad_scale: 32.0 2023-09-29 13:41:41,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=382473.3333333333, ans=0.125 2023-09-29 13:41:44,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:44,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:41:48,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:52,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=382473.3333333333, ans=0.1 2023-09-29 13:41:53,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:41:53,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 13:41:55,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:57,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:59,870 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=382540.0, ans=0.125 2023-09-29 13:42:01,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:42:03,609 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.26 vs. limit=6.0 2023-09-29 13:42:06,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:07,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:07,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:42:07,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:08,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=382540.0, ans=0.125 2023-09-29 13:42:10,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:10,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:10,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:14,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:42:14,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:42:16,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 13:42:20,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 13:42:20,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:22,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:22,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:23,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:42:25,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:25,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:28,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 13:42:31,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:42:36,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:42:38,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:42:38,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 13:42:38,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=382673.3333333333, ans=0.0 2023-09-29 13:42:39,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:42:41,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 13:42:42,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:42:43,648 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.02 vs. limit=10.0 2023-09-29 13:42:44,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:42:44,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:44,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:48,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 13:42:49,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:42:51,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 13:42:54,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:56,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:57,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:42:59,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:00,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:02,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:43:03,988 INFO [train.py:1039] (0/4) Epoch 11, batch 4300, loss[loss=0.2102, simple_loss=0.2772, pruned_loss=0.07158, over 23246.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.269, pruned_loss=0.06287, over 4727026.23 frames. ], batch size: 105, lr: 9.31e-03, grad_scale: 16.0 2023-09-29 13:43:04,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:04,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 13:43:05,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 13:43:07,086 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.989e+02 2.378e+02 2.757e+02 5.301e+02, threshold=4.756e+02, percent-clipped=4.0 2023-09-29 13:43:12,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:12,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:17,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:17,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=382806.6666666667, ans=0.125 2023-09-29 13:43:23,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:43:23,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 13:43:25,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:43:27,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:43:27,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:43:28,841 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-09-29 13:43:30,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=382873.3333333333, ans=0.125 2023-09-29 13:43:33,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:43:33,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:43:35,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 13:43:37,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:43:37,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 13:43:40,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:43:42,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:43:44,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:43:44,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:44,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=382940.0, ans=0.5 2023-09-29 13:43:46,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 13:43:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:47,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:47,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 13:43:49,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 13:43:52,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:54,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:54,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:43:54,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:56,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:56,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 13:43:56,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 13:43:56,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-09-29 13:43:56,904 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=12.0 2023-09-29 13:43:57,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:43:57,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 13:43:59,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 13:44:02,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:05,352 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 13:44:06,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:44:08,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:09,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:13,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 13:44:13,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:44:13,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:44:14,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:14,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:14,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:44:18,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:44:21,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:23,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:23,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:26,388 INFO [train.py:1039] (0/4) Epoch 11, batch 4350, loss[loss=0.2069, simple_loss=0.2666, pruned_loss=0.07363, over 23779.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2698, pruned_loss=0.06312, over 4727705.79 frames. ], batch size: 179, lr: 9.30e-03, grad_scale: 16.0 2023-09-29 13:44:30,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 13:44:31,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:44:34,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 13:44:34,976 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=383140.0, ans=0.0 2023-09-29 13:44:36,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=383140.0, ans=0.0 2023-09-29 13:44:37,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:40,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:44:40,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:44:40,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=383206.6666666667, ans=0.125 2023-09-29 13:44:41,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=383206.6666666667, ans=0.0 2023-09-29 13:44:41,353 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.52 vs. limit=15.0 2023-09-29 13:44:45,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:44:49,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:51,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=383206.6666666667, ans=0.125 2023-09-29 13:44:52,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 13:44:52,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:44:57,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:44:59,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:45:07,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 13:45:08,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:08,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:13,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:16,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 13:45:19,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:21,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 13:45:25,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=383340.0, ans=0.125 2023-09-29 13:45:26,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 13:45:28,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:28,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:45:29,748 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 13:45:29,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 13:45:29,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:29,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:31,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:45:32,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:34,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:34,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:45:37,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 13:45:37,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:37,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 13:45:39,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 13:45:39,318 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 13:45:40,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 13:45:43,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:45:43,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:45:43,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:45:45,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:45:45,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 13:45:47,207 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 13:45:47,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:48,654 INFO [train.py:1039] (0/4) Epoch 11, batch 4400, loss[loss=0.2078, simple_loss=0.2891, pruned_loss=0.06323, over 24382.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2713, pruned_loss=0.06404, over 4724834.26 frames. ], batch size: 77, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:45:50,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:45:50,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:51,546 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.43 vs. limit=5.0 2023-09-29 13:45:51,786 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.226e+02 2.866e+02 4.775e+02, threshold=4.452e+02, percent-clipped=1.0 2023-09-29 13:45:52,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:55,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 13:45:55,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. 
Duration: 0.7655625 2023-09-29 13:45:55,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 13:45:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 13:45:57,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:45:57,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:46:00,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 13:46:02,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:04,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:04,208 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 13:46:08,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:08,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 13:46:09,934 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 13:46:13,536 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.43 vs. 
limit=12.0 2023-09-29 13:46:14,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 13:46:14,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 13:46:15,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 13:46:15,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:17,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:19,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:20,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:20,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 13:46:20,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 13:46:22,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:23,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:46:23,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:46:25,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:26,412 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.83 vs. limit=15.0 2023-09-29 13:46:27,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:27,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 13:46:28,467 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 13:46:28,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=383606.6666666667, ans=0.0 2023-09-29 13:46:31,732 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=383606.6666666667, ans=0.0 2023-09-29 13:46:33,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:40,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:41,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 13:46:46,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:46:48,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 13:46:52,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:46:52,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 13:46:54,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:46:54,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:46:54,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:46:54,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:46:59,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 13:47:02,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 13:47:02,800 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.93 vs. limit=15.0 2023-09-29 13:47:03,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 13:47:03,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:03,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-09-29 13:47:03,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:47:07,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:47:09,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 13:47:12,122 INFO [train.py:1039] (0/4) Epoch 11, batch 4450, loss[loss=0.2139, simple_loss=0.2797, pruned_loss=0.07407, over 23552.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.272, pruned_loss=0.06437, over 4728705.82 frames. ], batch size: 134, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:47:12,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:47:16,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:17,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:47:18,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=383806.6666666667, ans=0.1 2023-09-29 13:47:23,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:47:23,538 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=383806.6666666667, ans=0.0 2023-09-29 13:47:24,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:47:26,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:29,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:47:30,215 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.31 vs. limit=15.0 2023-09-29 13:47:31,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:47:31,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:32,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 13:47:32,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:34,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:34,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:47:35,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:47:38,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 13:47:41,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=383873.3333333333, ans=0.125 2023-09-29 13:47:45,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:45,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:48,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:49,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:49,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:47:57,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:47:57,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 13:47:58,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 13:47:58,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:48:02,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:02,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. 
Duration: 0.66 2023-09-29 13:48:06,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:48:09,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:10,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=384006.6666666667, ans=0.0 2023-09-29 13:48:11,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 13:48:11,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:11,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:11,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:48:12,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:13,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:16,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:48:16,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 13:48:19,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 13:48:21,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:48:23,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:23,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:25,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:48:27,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:48:30,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 13:48:32,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:48:36,511 INFO [train.py:1039] (0/4) Epoch 11, batch 4500, loss[loss=0.2193, simple_loss=0.2965, pruned_loss=0.07109, over 24124.00 frames. ], tot_loss[loss=0.201, simple_loss=0.2726, pruned_loss=0.06469, over 4727015.44 frames. ], batch size: 80, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:48:38,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:39,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 13:48:39,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-09-29 13:48:41,278 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.027e+02 2.276e+02 2.770e+02 4.229e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 13:48:41,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:48:47,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:48,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:49,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:48:50,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:48:50,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:48:50,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:49:01,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=384206.6666666667, ans=0.2 2023-09-29 13:49:03,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:49:05,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:49:07,828 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=6.73 vs. limit=15.0 2023-09-29 13:49:08,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:09,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:49:09,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:49:17,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:49:20,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:49:22,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=384273.3333333333, ans=0.0 2023-09-29 13:49:25,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:49:25,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=384340.0, ans=0.05 2023-09-29 13:49:28,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:49:28,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 13:49:30,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 13:49:30,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:36,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:49:37,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 13:49:37,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:49:37,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:42,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:49:42,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:49:44,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:47,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:49:47,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 13:49:50,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 13:49:51,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 13:49:51,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 13:49:56,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 13:49:58,058 INFO [train.py:1039] (0/4) Epoch 11, batch 4550, loss[loss=0.1903, simple_loss=0.2653, pruned_loss=0.05766, over 24512.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.2716, pruned_loss=0.06471, over 4716722.28 frames. ], batch size: 63, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:49:58,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 13:49:59,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:03,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:04,208 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=15.0 2023-09-29 13:50:04,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:07,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 13:50:10,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:50:14,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:50:17,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:17,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:50:17,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:20,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:21,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:24,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:50:27,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 13:50:27,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 13:50:29,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:50:32,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-09-29 13:50:35,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 13:50:35,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:40,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 13:50:42,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:50:45,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:46,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:46,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:50:49,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 13:50:51,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:50:52,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:52,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:55,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 13:50:57,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 13:50:57,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 13:50:58,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:50:58,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 13:51:01,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 13:51:01,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:51:03,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:03,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:04,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:04,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:51:05,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:51:06,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-09-29 13:51:08,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:51:08,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:51:08,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 13:51:08,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:51:08,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 13:51:11,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=384740.0, ans=0.1 2023-09-29 13:51:12,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:51:12,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:51:14,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:51:14,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:14,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:51:18,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 13:51:18,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:51:21,626 INFO [train.py:1039] (0/4) Epoch 11, batch 4600, loss[loss=0.1906, simple_loss=0.2549, pruned_loss=0.06313, over 23640.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2707, pruned_loss=0.06463, over 4720849.82 frames. ], batch size: 232, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:51:21,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:23,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:25,952 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.102e+02 2.367e+02 2.907e+02 4.657e+02, threshold=4.735e+02, percent-clipped=1.0 2023-09-29 13:51:26,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:51:26,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:51:27,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:29,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 13:51:30,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:51:35,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 13:51:36,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:40,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:46,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 13:51:48,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:51,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:55,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:51:55,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:01,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 13:52:01,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:52:02,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:07,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:07,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 13:52:08,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:52:13,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 13:52:13,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:52:20,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:20,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:52:22,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:22,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 13:52:22,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:23,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 13:52:23,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:25,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:27,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 13:52:27,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:29,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:30,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 13:52:30,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 13:52:32,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 13:52:32,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:33,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:35,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:35,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:42,916 INFO [train.py:1039] (0/4) Epoch 11, batch 4650, loss[loss=0.2177, simple_loss=0.2943, pruned_loss=0.07054, over 24546.00 frames. ], tot_loss[loss=0.1989, simple_loss=0.2693, pruned_loss=0.06423, over 4718595.77 frames. ], batch size: 71, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:52:46,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 13:52:49,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:50,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:50,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:52:52,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:52,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:52,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:58,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 13:53:01,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:53:05,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 13:53:05,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:53:05,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 13:53:05,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 13:53:06,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 13:53:06,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 13:53:06,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:06,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:53:09,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:53:11,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:11,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 13:53:14,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:16,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 13:53:19,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:19,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:53:20,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. 
Duration: 0.77 2023-09-29 13:53:22,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:53:25,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:53:29,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:34,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:38,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:40,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:40,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:53:41,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 13:53:41,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 13:53:43,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 13:53:43,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 13:53:45,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 13:53:51,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:53:51,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:53:51,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 13:53:51,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:52,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:52,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:53:54,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:53:57,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:53:57,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:59,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:54:02,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:02,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 13:54:04,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:54:04,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 13:54:06,342 INFO [train.py:1039] (0/4) Epoch 11, batch 4700, loss[loss=0.2003, simple_loss=0.2659, pruned_loss=0.06741, over 23804.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2696, pruned_loss=0.06438, over 4715980.69 frames. ], batch size: 179, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:54:06,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:54:06,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 13:54:06,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=385473.3333333333, ans=0.125 2023-09-29 13:54:11,822 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.872e+02 2.042e+02 2.233e+02 3.363e+02, threshold=4.084e+02, percent-clipped=0.0 2023-09-29 13:54:13,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:13,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=385473.3333333333, ans=0.0 2023-09-29 13:54:15,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:16,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 13:54:17,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:20,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:54:25,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 13:54:27,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 13:54:30,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:30,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:54:31,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:54:34,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=385540.0, ans=0.125 2023-09-29 13:54:36,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=385540.0, ans=0.125 2023-09-29 13:54:37,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:44,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 13:54:44,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=385606.6666666667, ans=0.125 2023-09-29 13:54:45,089 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.87 vs. limit=15.0 2023-09-29 13:54:45,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:54:47,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:53,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.04 vs. limit=15.0 2023-09-29 13:54:55,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 13:54:55,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:54:58,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:00,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 13:55:03,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:07,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 13:55:07,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 13:55:08,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:08,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:12,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:55:12,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:55:12,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 13:55:12,913 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 13:55:14,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:14,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 13:55:16,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 13:55:20,487 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.37 vs. limit=15.0 2023-09-29 13:55:22,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 13:55:26,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:55:27,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:29,233 INFO [train.py:1039] (0/4) Epoch 11, batch 4750, loss[loss=0.1977, simple_loss=0.2817, pruned_loss=0.05681, over 24482.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2705, pruned_loss=0.06443, over 4719336.10 frames. ], batch size: 66, lr: 9.27e-03, grad_scale: 16.0 2023-09-29 13:55:32,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:32,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=385806.6666666667, ans=0.125 2023-09-29 13:55:33,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:55:35,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 13:55:35,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:55:39,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. 
Duration: 0.81 2023-09-29 13:55:40,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:55:40,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:43,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:47,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 13:55:51,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:55:51,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=385873.3333333333, ans=0.2 2023-09-29 13:55:53,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 13:55:54,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:56,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:57,941 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. 
Duration: 0.896 2023-09-29 13:55:57,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 13:56:01,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=385940.0, ans=0.2 2023-09-29 13:56:04,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 13:56:09,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:12,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:14,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:56:14,492 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 13:56:14,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:18,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:56:21,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:56:21,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 13:56:23,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. 
Duration: 0.87 2023-09-29 13:56:23,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:56:23,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:56:24,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:24,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:56:26,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 13:56:27,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 13:56:31,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:32,279 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.33 vs. limit=22.5 2023-09-29 13:56:33,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:56:33,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. 
Duration: 0.99 2023-09-29 13:56:33,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=386006.6666666667, ans=0.125 2023-09-29 13:56:34,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=386006.6666666667, ans=0.1 2023-09-29 13:56:35,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:56:36,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:38,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:56:39,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:39,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:56:42,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:44,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 13:56:44,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 13:56:46,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 13:56:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 13:56:49,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:50,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 13:56:50,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=386073.3333333333, ans=0.1 2023-09-29 13:56:52,197 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.69 vs. limit=10.0 2023-09-29 13:56:52,870 INFO [train.py:1039] (0/4) Epoch 11, batch 4800, loss[loss=0.218, simple_loss=0.2747, pruned_loss=0.08067, over 23751.00 frames. ], tot_loss[loss=0.2011, simple_loss=0.2717, pruned_loss=0.06525, over 4715490.32 frames. ], batch size: 232, lr: 9.27e-03, grad_scale: 32.0 2023-09-29 13:56:54,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:54,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:57,599 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.978e+02 2.285e+02 2.567e+02 3.711e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 13:56:59,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:57:01,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:57:01,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:02,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 13:57:03,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:57:04,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:57:06,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:57:09,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:12,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:12,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:57:14,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:14,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:57:15,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:17,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:57:18,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:22,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:57:26,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:57:28,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:28,689 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=386273.3333333333, ans=0.0 2023-09-29 13:57:28,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=386273.3333333333, ans=0.125 2023-09-29 13:57:30,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 13:57:30,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 13:57:31,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 13:57:33,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:57:33,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:33,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:57:36,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:57:36,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:41,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:41,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:43,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:57:48,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 13:57:48,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:49,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:49,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 13:57:49,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:54,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:56,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:57:56,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:56,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:57:57,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:57:58,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:57:58,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=386406.6666666667, ans=0.0 2023-09-29 13:58:02,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:02,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:02,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 13:58:04,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 13:58:07,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 13:58:07,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:07,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:08,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:08,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:12,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:58:14,017 INFO [train.py:1039] (0/4) Epoch 11, batch 4850, loss[loss=0.2108, simple_loss=0.2924, pruned_loss=0.06457, over 24002.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2728, pruned_loss=0.06607, over 4715445.71 frames. ], batch size: 80, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:58:17,296 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.56 vs. limit=15.0 2023-09-29 13:58:22,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 13:58:23,249 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.81 vs. 
limit=12.0 2023-09-29 13:58:24,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:27,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:27,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:58:27,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:33,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:33,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:58:34,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:58:34,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 13:58:40,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:43,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:58:43,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:58:45,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 13:58:45,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 13:58:47,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:47,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 13:58:52,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 13:58:53,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:59:01,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:59:01,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 13:59:02,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:59:02,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:59:06,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 13:59:08,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 13:59:08,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:09,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 13:59:09,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:11,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:12,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 13:59:22,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:27,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=386740.0, ans=0.125 2023-09-29 13:59:30,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:59:30,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:36,687 INFO [train.py:1039] (0/4) Epoch 11, batch 4900, loss[loss=0.1868, simple_loss=0.2675, pruned_loss=0.05299, over 24671.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.2715, pruned_loss=0.0647, over 4722510.54 frames. 
], batch size: 65, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:59:36,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 13:59:36,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:59:41,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:42,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=386806.6666666667, ans=0.125 2023-09-29 13:59:43,966 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.984e+02 2.247e+02 2.564e+02 4.606e+02, threshold=4.494e+02, percent-clipped=1.0 2023-09-29 13:59:44,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:44,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:59:44,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=386806.6666666667, ans=0.125 2023-09-29 13:59:47,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 13:59:51,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 13:59:56,272 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=15.95 vs. 
limit=15.0 2023-09-29 13:59:56,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 13:59:57,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 13:59:58,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:59:58,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:58,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:59:58,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:58,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:00:00,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 14:00:03,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 14:00:04,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:00:06,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 14:00:06,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=386873.3333333333, ans=0.0 2023-09-29 14:00:08,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:00:09,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:00:09,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:11,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:11,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 14:00:13,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:00:14,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:00:16,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 14:00:16,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 14:00:19,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 14:00:21,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 14:00:21,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:00:21,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:00:23,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:23,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:00:23,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:00:24,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 14:00:27,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:29,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:00:31,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:00:34,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 14:00:34,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:00:36,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. 
Duration: 0.488875 2023-09-29 14:00:37,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 14:00:45,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:47,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:00:47,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=387073.3333333333, ans=0.125 2023-09-29 14:00:49,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 14:00:49,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:00:49,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:00:51,872 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.85 vs. limit=15.0 2023-09-29 14:00:54,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:57,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:00:57,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 14:00:59,088 INFO [train.py:1039] (0/4) Epoch 11, batch 4950, loss[loss=0.2046, simple_loss=0.2818, pruned_loss=0.06369, over 24524.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2693, pruned_loss=0.06446, over 4716199.55 frames. ], batch size: 71, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 14:00:59,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:59,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 14:01:00,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:01:03,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:03,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:01:08,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 14:01:08,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 14:01:10,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:01:10,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 14:01:10,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:01:10,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:01:11,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:01:11,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:14,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:14,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:01:16,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:01:16,680 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=387206.6666666667, ans=10.0 2023-09-29 14:01:18,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:20,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:20,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:01:24,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 14:01:29,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:32,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:01:33,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:33,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:36,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:01:36,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 14:01:38,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 14:01:38,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=387273.3333333333, ans=0.1 2023-09-29 14:01:41,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:43,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:01:43,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:01:43,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 14:01:44,031 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:01:45,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:01:45,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:01:48,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:50,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:01:51,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:01:53,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:53,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:55,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 14:01:55,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:01:56,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:02:01,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:02:03,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:02:03,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:02:03,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:05,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:02:05,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:02:08,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:02:09,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:02:09,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:02:10,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 14:02:16,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:21,068 INFO [train.py:1039] (0/4) Epoch 11, batch 5000, loss[loss=0.2287, simple_loss=0.3005, pruned_loss=0.07842, over 23986.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2692, pruned_loss=0.06419, over 4722247.24 frames. 
], batch size: 80, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:02:21,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 14:02:21,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:02:21,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=387473.3333333333, ans=0.125 2023-09-29 14:02:25,380 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.49 vs. limit=15.0 2023-09-29 14:02:27,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:27,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:29,464 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.930e+02 2.192e+02 2.539e+02 4.135e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 14:02:29,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 14:02:29,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 14:02:31,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:02:35,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. 
Duration: 0.77 2023-09-29 14:02:35,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=387473.3333333333, ans=0.125 2023-09-29 14:02:37,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:02:37,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:02:37,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 14:02:37,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:38,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:02:38,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 14:02:38,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:38,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=387540.0, ans=0.1 2023-09-29 14:02:40,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:02:40,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 14:02:41,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. 
Duration: 0.87 2023-09-29 14:02:41,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:02:43,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 14:02:43,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:02:43,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:43,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:02:43,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 14:02:43,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 14:02:45,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=387540.0, ans=0.125 2023-09-29 14:02:47,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 14:02:47,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:47,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:50,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. 
Duration: 0.86 2023-09-29 14:02:50,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:50,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:51,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:53,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:02:55,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 14:02:56,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:57,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:03:01,834 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 14:03:05,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:03:06,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:03:06,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:03:07,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=387606.6666666667, ans=0.09899494936611666 2023-09-29 14:03:09,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 14:03:10,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:03:10,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:10,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:13,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 14:03:15,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:21,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=387673.3333333333, ans=0.0 2023-09-29 14:03:25,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. 
Duration: 0.89 2023-09-29 14:03:28,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:39,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:39,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=387740.0, ans=0.2 2023-09-29 14:03:40,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:40,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:03:40,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:40,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:03:40,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:03:42,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:45,667 INFO [train.py:1039] (0/4) Epoch 11, batch 5050, loss[loss=0.1867, simple_loss=0.2573, pruned_loss=0.05804, over 20732.00 frames. ], tot_loss[loss=0.1987, simple_loss=0.2692, pruned_loss=0.06406, over 4729459.96 frames. ], batch size: 45, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:03:47,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:03:47,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 14:03:48,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:03:50,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:50,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:03:52,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 14:03:54,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:56,642 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.88 vs. limit=15.0 2023-09-29 14:03:57,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:03:58,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:03:59,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:04:08,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. 
Duration: 0.91 2023-09-29 14:04:08,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:04:09,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:11,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 14:04:11,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:14,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:14,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:04:16,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:04:16,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 14:04:16,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 14:04:17,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:19,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:22,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:04:24,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 14:04:25,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:29,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 14:04:30,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:04:32,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:04:32,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:33,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:35,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:04:37,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:04:38,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:38,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:04:38,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 14:04:39,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 14:04:40,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:04:41,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:46,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:46,692 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 14:04:46,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:04:48,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:04:51,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:51,123 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 14:04:54,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:54,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 14:04:54,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:04:58,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:58,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:58,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 14:05:02,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 14:05:03,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=388073.3333333333, ans=0.0 2023-09-29 14:05:05,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:05,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:06,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:05:08,073 INFO [train.py:1039] (0/4) Epoch 11, batch 5100, loss[loss=0.1628, simple_loss=0.2352, pruned_loss=0.04516, over 24327.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.27, pruned_loss=0.06405, over 4725911.08 frames. ], batch size: 56, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:05:08,257 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 14:05:11,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 14:05:14,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 14:05:14,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 14:05:15,752 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.127e+02 2.436e+02 3.285e+02, threshold=4.254e+02, percent-clipped=0.0 2023-09-29 14:05:15,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:16,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=388140.0, ans=0.125 2023-09-29 14:05:17,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:05:20,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:05:22,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 14:05:22,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 14:05:29,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:05:29,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:05:33,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:05:35,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=388206.6666666667, ans=0.125 2023-09-29 14:05:36,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 14:05:36,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:38,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:05:38,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 14:05:41,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 14:05:44,696 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 14:05:46,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:46,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 14:05:46,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. 
Duration: 0.97 2023-09-29 14:05:49,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:57,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=388340.0, ans=0.07 2023-09-29 14:05:59,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:05:59,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=388340.0, ans=0.125 2023-09-29 14:06:03,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 14:06:03,536 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 14:06:03,560 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 14:06:05,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 14:06:05,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:06:08,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 14:06:12,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 14:06:16,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 14:06:17,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:06:19,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 14:06:22,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:06:23,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 14:06:27,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:06:28,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:06:28,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:06:30,038 INFO [train.py:1039] (0/4) Epoch 11, batch 5150, loss[loss=0.2176, simple_loss=0.2826, pruned_loss=0.07625, over 23712.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2708, pruned_loss=0.06412, over 4737297.63 frames. ], batch size: 212, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:06:30,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:06:30,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:06:30,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:06:32,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 14:06:32,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 14:06:32,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 14:06:34,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:06:34,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 14:06:35,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:37,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:06:38,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:40,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:44,645 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.38 vs. limit=22.5 2023-09-29 14:06:45,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:06:45,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. 
Duration: 0.8 2023-09-29 14:06:47,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:47,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:06:48,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:06:48,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:06:48,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:06:50,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:06:50,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:06:52,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 14:06:53,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:06:54,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:06:55,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:06:58,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. 
Duration: 0.79 2023-09-29 14:06:58,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:06:59,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=388540.0, ans=0.07 2023-09-29 14:07:02,223 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=388606.6666666667, ans=0.09899494936611666 2023-09-29 14:07:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:07:05,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 14:07:11,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=388606.6666666667, ans=0.125 2023-09-29 14:07:12,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:07:17,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:19,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:22,764 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.46 vs. limit=15.0 2023-09-29 14:07:23,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:07:23,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:25,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 14:07:25,957 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.08 vs. limit=15.0 2023-09-29 14:07:26,213 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.05 vs. limit=15.0 2023-09-29 14:07:29,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:07:30,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:07:30,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:07:33,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:34,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:35,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 14:07:40,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:07:43,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:07:43,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=388740.0, ans=0.1 2023-09-29 14:07:45,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=388740.0, ans=0.0 2023-09-29 14:07:46,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:47,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:07:49,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:07:49,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:07:49,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:07:49,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:07:53,029 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.12 vs. limit=15.0 2023-09-29 14:07:53,712 INFO [train.py:1039] (0/4) Epoch 11, batch 5200, loss[loss=0.1864, simple_loss=0.2607, pruned_loss=0.05601, over 24337.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2723, pruned_loss=0.06557, over 4713576.04 frames. 
], batch size: 61, lr: 9.24e-03, grad_scale: 16.0 2023-09-29 14:07:53,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:07:55,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:07:57,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:07:58,391 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.30 vs. limit=22.5 2023-09-29 14:08:02,217 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.017e+02 2.564e+02 3.234e+02 5.917e+02, threshold=5.129e+02, percent-clipped=10.0 2023-09-29 14:08:02,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 14:08:02,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:08:03,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:08,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:08,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:08:08,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 14:08:11,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 14:08:13,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:08:15,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:16,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 14:08:19,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:08:20,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:08:22,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 14:08:22,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 14:08:25,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 14:08:25,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:25,166 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 14:08:26,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 14:08:28,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:28,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:08:28,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 14:08:29,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:08:32,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:35,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 14:08:36,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 14:08:36,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 14:08:41,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 14:08:41,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:08:48,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:08:49,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:08:51,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 14:08:51,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:51,245 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=389006.6666666667, ans=0.125 2023-09-29 14:08:52,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:08:52,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:52,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:08:56,197 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.56 vs. limit=22.5 2023-09-29 14:08:57,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:08:58,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:09:01,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:09:02,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:09:02,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:09,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:10,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 14:09:12,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:09:12,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:09:14,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:14,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:09:15,414 INFO [train.py:1039] (0/4) Epoch 11, batch 5250, loss[loss=0.1865, simple_loss=0.2548, pruned_loss=0.05916, over 24295.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.2709, pruned_loss=0.06499, over 4715219.72 frames. ], batch size: 56, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:09:17,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:09:17,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:09:20,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:09:20,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:09:22,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:09:22,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=389140.0, ans=0.1 2023-09-29 14:09:27,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:29,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:09:31,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=389206.6666666667, ans=0.1 2023-09-29 14:09:32,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:09:34,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=389206.6666666667, ans=0.1 2023-09-29 14:09:34,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=389206.6666666667, ans=0.05 2023-09-29 14:09:35,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:09:37,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. 
Duration: 0.95 2023-09-29 14:09:37,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:39,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:55,175 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=12.0 2023-09-29 14:10:10,620 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.81 vs. limit=15.0 2023-09-29 14:10:13,489 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.44 vs. limit=15.0 2023-09-29 14:10:22,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=389406.6666666667, ans=0.125 2023-09-29 14:10:25,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=389406.6666666667, ans=0.04949747468305833 2023-09-29 14:10:25,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=389406.6666666667, ans=0.125 2023-09-29 14:10:27,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=389406.6666666667, ans=0.2 2023-09-29 14:10:28,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=389473.3333333333, ans=0.125 2023-09-29 14:10:29,795 INFO [train.py:1039] (0/4) Epoch 11, batch 5300, loss[loss=0.1888, simple_loss=0.2596, pruned_loss=0.059, over 18588.00 frames. 
], tot_loss[loss=0.1998, simple_loss=0.2703, pruned_loss=0.06463, over 4714545.77 frames. ], batch size: 40, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:10:31,536 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:10:36,666 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.030e+02 2.219e+02 2.602e+02 3.750e+02, threshold=4.437e+02, percent-clipped=0.0 2023-09-29 14:10:44,902 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-11.pt 2023-09-29 14:10:51,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:10:51,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 14:10:51,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 14:10:51,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:52,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:52,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:52,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:52,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:10:52,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:10:52,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:52,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:10:53,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:10:53,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 14:10:53,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 14:10:53,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 14:10:53,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:10:53,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 14:10:53,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 14:10:53,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:54,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 14:10:54,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:55,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:55,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:10:55,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:55,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:55,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:55,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:55,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:55,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:10:56,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:56,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:10:57,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-09-29 14:10:57,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:57,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:57,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 14:10:57,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 14:10:57,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:10:57,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:10:57,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 14:10:58,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 14:10:58,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:10:59,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:10:59,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:59,821 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-09-29 14:10:59,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 14:10:59,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:11:00,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:11:00,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 14:11:00,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 14:11:00,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 14:11:00,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:11:03,754 INFO [train.py:1039] (0/4) Epoch 12, batch 0, loss[loss=0.2214, simple_loss=0.2973, pruned_loss=0.07276, over 23943.00 frames. ], tot_loss[loss=0.2214, simple_loss=0.2973, pruned_loss=0.07276, over 23943.00 frames. ], batch size: 86, lr: 8.84e-03, grad_scale: 32.0 2023-09-29 14:11:03,755 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 14:11:19,086 INFO [train.py:1071] (0/4) Epoch 12, validation: loss=0.305, simple_loss=0.2807, pruned_loss=0.1647, over 1125622.00 frames. 2023-09-29 14:11:19,087 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 14:11:23,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-09-29 14:11:24,656 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.08 vs. limit=22.5 2023-09-29 14:11:25,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:11:26,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:11:30,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:30,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:11:30,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:31,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 14:11:33,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 14:11:34,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:36,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:40,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:40,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:11:40,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:11:40,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:41,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 14:11:43,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:51,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:11:51,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:54,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 14:11:57,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:11:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:12:00,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:00,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=389686.6666666667, ans=0.125 2023-09-29 14:12:05,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:12:08,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:15,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 14:12:18,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 14:12:18,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:18,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:18,893 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.93 vs. limit=15.0 2023-09-29 14:12:19,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:12:19,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:12:22,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 14:12:24,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:27,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:12:31,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:12:33,340 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 14:12:36,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:12:39,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:40,547 INFO [train.py:1039] (0/4) Epoch 12, batch 50, loss[loss=0.1919, simple_loss=0.279, pruned_loss=0.05243, over 24639.00 frames. ], tot_loss[loss=0.2045, simple_loss=0.2765, pruned_loss=0.06621, over 1071477.96 frames. ], batch size: 73, lr: 8.84e-03, grad_scale: 16.0 2023-09-29 14:12:42,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:42,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 14:12:42,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:12:42,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:12:44,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:46,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:12:47,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:49,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=389886.6666666667, ans=0.0 2023-09-29 14:12:52,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 14:12:52,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:52,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=389886.6666666667, ans=0.125 2023-09-29 14:12:57,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:12:58,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 14:13:01,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 14:13:02,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:13:04,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:04,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:05,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:13:07,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:13:07,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:13:07,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:13,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:15,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:16,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:13:17,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 14:13:20,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:13:20,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:13:20,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 14:13:22,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:23,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. 
Duration: 0.97 2023-09-29 14:13:32,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=390086.6666666667, ans=0.0 2023-09-29 14:13:33,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:13:33,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:35,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:37,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:37,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:40,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 14:13:40,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 14:13:42,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:42,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:43,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:43,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:13:45,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 14:13:45,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 14:13:48,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:13:49,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:49,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:13:49,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 14:13:49,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 14:13:51,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:53,162 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.076e+02 2.460e+02 3.514e+02 7.647e+02, threshold=4.919e+02, percent-clipped=15.0 2023-09-29 14:13:53,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:54,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:13:54,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:13:58,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:14:01,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:14:01,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=390220.0, ans=0.07 2023-09-29 14:14:02,838 INFO [train.py:1039] (0/4) Epoch 12, batch 100, loss[loss=0.1689, simple_loss=0.2441, pruned_loss=0.04689, over 21287.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2741, pruned_loss=0.06436, over 1884969.49 frames. ], batch size: 46, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:14:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:06,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 14:14:06,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:14:08,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=390220.0, ans=0.125 2023-09-29 14:14:10,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:14:11,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:11,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 14:14:11,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:14:11,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:15,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 14:14:16,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:14:17,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=390220.0, ans=0.2 2023-09-29 14:14:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:18,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:18,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:23,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 14:14:24,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:26,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:26,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 14:14:28,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:14:31,508 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 14:14:31,533 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 14:14:34,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:14:34,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:14:36,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:14:39,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:40,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.26 vs. limit=12.0 2023-09-29 14:14:41,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:49,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:51,016 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-09-29 14:14:52,073 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=390420.0, ans=0.025 2023-09-29 14:14:53,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:14:57,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:14:57,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:00,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:02,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:07,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:08,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:15:10,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:12,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:13,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:13,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:15:13,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:13,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 14:15:13,765 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 14:15:13,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:15,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:15:15,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:15,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:17,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 14:15:17,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:15:17,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:15:17,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:18,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:15:20,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:22,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:15:22,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:15:25,205 INFO [train.py:1039] (0/4) Epoch 12, batch 150, loss[loss=0.1748, simple_loss=0.2488, pruned_loss=0.05045, over 24591.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2735, pruned_loss=0.06457, over 2515897.77 frames. ], batch size: 60, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:15:25,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:30,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:30,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:15:30,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:34,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:35,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:36,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 14:15:38,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:43,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 14:15:43,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 14:15:43,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 14:15:46,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:15:46,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:15:48,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:49,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:49,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:49,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:49,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:51,630 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-29 14:15:54,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:00,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:00,893 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.65 vs. limit=15.0 2023-09-29 14:16:05,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:16:05,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=390686.6666666667, ans=0.1 2023-09-29 14:16:06,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 14:16:09,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:16:09,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:09,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:12,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:16:14,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:16:15,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:16:16,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=390753.3333333333, ans=0.125 2023-09-29 14:16:18,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:19,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 14:16:19,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=390753.3333333333, ans=0.125 2023-09-29 14:16:24,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:25,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:25,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:16:25,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:16:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:30,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-09-29 14:16:34,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:16:34,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:16:36,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:37,586 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.933e+02 2.162e+02 2.482e+02 3.211e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-29 14:16:39,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:16:41,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 14:16:41,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:41,282 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 14:16:44,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:47,329 INFO [train.py:1039] (0/4) Epoch 12, batch 200, loss[loss=0.1705, simple_loss=0.2497, pruned_loss=0.04562, over 24301.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.273, pruned_loss=0.06495, over 3017569.39 frames. ], batch size: 61, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:16:50,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:16:50,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:16:52,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 14:16:53,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:53,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:57,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 14:16:57,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:16:58,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:00,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:00,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=390886.6666666667, ans=0.125 2023-09-29 14:17:03,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.28 vs. limit=15.0 2023-09-29 14:17:04,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 14:17:04,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:17:04,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:05,997 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:17:08,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=390953.3333333333, ans=0.125 2023-09-29 14:17:17,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=390953.3333333333, ans=0.0 2023-09-29 14:17:25,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:17:26,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:17:26,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:17:28,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:17:28,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:17:30,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:17:32,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:17:33,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:17:34,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:17:35,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:17:37,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 14:17:39,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:17:39,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:44,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:17:50,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:17:51,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=391153.3333333333, ans=0.125 2023-09-29 14:18:00,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:00,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 14:18:04,544 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.81 vs. limit=15.0 2023-09-29 14:18:07,854 INFO [train.py:1039] (0/4) Epoch 12, batch 250, loss[loss=0.1872, simple_loss=0.2702, pruned_loss=0.05204, over 24347.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2723, pruned_loss=0.06456, over 3404360.19 frames. ], batch size: 77, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:18:07,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:10,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 14:18:11,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:18:11,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:11,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:18:13,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 14:18:13,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:18:15,569 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-09-29 14:18:15,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:17,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:18:18,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:20,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:21,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:18:21,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:24,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:18:30,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:18:40,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:18:42,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:43,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:18:49,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 14:18:50,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:18:50,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:18:52,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:52,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:18:52,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:18:53,428 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=15.0 2023-09-29 14:18:54,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:57,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:19:00,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 14:19:00,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:19:04,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:19:04,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 14:19:04,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:19:06,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:06,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:19:06,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:19:09,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:09,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:19:10,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:11,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=391420.0, ans=0.2 2023-09-29 14:19:12,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=391420.0, ans=0.0 2023-09-29 14:19:14,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:19:14,623 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=391486.6666666667, ans=0.2 2023-09-29 14:19:17,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 14:19:19,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:19:22,364 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.950e+02 2.182e+02 2.665e+02 5.527e+02, threshold=4.363e+02, percent-clipped=2.0 2023-09-29 14:19:25,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:19:30,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 14:19:30,369 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=391553.3333333333, ans=0.125 2023-09-29 14:19:31,301 INFO [train.py:1039] (0/4) Epoch 12, batch 300, loss[loss=0.1906, simple_loss=0.2719, pruned_loss=0.05466, over 24298.00 frames. ], tot_loss[loss=0.1992, simple_loss=0.2701, pruned_loss=0.06414, over 3695245.50 frames. ], batch size: 74, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:19:31,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:19:33,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:34,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 14:19:34,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 14:19:35,519 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.20 vs. limit=15.0 2023-09-29 14:19:36,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:19:36,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 14:19:36,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=391553.3333333333, ans=0.125 2023-09-29 14:19:40,675 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.239e-03 2023-09-29 14:19:41,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:43,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:19:46,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:19:48,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 14:19:49,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:51,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:19:51,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. 
Duration: 0.85 2023-09-29 14:19:51,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:19:51,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=391620.0, ans=0.015 2023-09-29 14:19:56,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:19:59,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:20:01,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 14:20:02,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 14:20:04,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:04,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:09,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:09,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 14:20:09,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:20:13,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 14:20:14,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:20:14,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:19,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:20:19,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 14:20:21,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:20:24,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:26,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 14:20:27,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:31,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=391753.3333333333, ans=0.125 2023-09-29 14:20:32,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:20:34,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:20:34,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-09-29 14:20:38,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:38,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:20:40,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:42,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:20:43,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 14:20:44,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:20:44,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:44,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=391820.0, ans=0.0 2023-09-29 14:20:45,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 14:20:47,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:47,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:49,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 14:20:50,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:50,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:54,292 INFO [train.py:1039] (0/4) Epoch 12, batch 350, loss[loss=0.1822, simple_loss=0.2204, pruned_loss=0.07195, over 18839.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2678, pruned_loss=0.0633, over 3919016.97 frames. ], batch size: 388, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:20:55,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:20:55,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:20:59,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:02,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=391886.6666666667, ans=0.125 2023-09-29 14:21:07,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:21:10,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:10,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:12,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. 
Duration: 0.97 2023-09-29 14:21:13,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:15,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 14:21:17,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:17,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 14:21:18,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:22,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 14:21:24,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:21:27,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:29,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:21:29,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:29,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:30,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 14:21:30,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:30,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:21:32,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:21:32,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:40,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:21:40,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:21:40,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:21:42,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:46,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 14:21:46,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:47,829 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.29 vs. limit=15.0 2023-09-29 14:21:52,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:21:52,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:21:53,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:55,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 14:21:57,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:21:57,140 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 14:22:00,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 14:22:00,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:05,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:22:05,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 14:22:06,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=392153.3333333333, ans=0.0 2023-09-29 14:22:07,471 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.933e+02 2.139e+02 2.473e+02 3.749e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-29 14:22:07,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:22:09,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:22:10,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:10,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:10,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:13,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:14,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=392153.3333333333, ans=0.0 2023-09-29 14:22:15,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=392220.0, ans=0.125 2023-09-29 14:22:16,764 INFO [train.py:1039] (0/4) Epoch 12, batch 400, loss[loss=0.196, simple_loss=0.2548, pruned_loss=0.06856, over 22743.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2677, pruned_loss=0.06306, over 4094963.12 frames. ], batch size: 322, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:22:16,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:22:18,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:22:20,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. 
Duration: 0.64 2023-09-29 14:22:20,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:20,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:23,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:22:23,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:26,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:29,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:31,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 14:22:32,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 14:22:32,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:37,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 14:22:37,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:40,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 14:22:40,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:40,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 14:22:41,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:22:41,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:41,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:43,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:44,991 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 14:22:46,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 14:22:51,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:51,747 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.33 vs. limit=22.5 2023-09-29 14:22:52,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:54,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. 
Duration: 0.81 2023-09-29 14:22:55,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 14:22:57,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:22:57,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.45 vs. limit=15.0 2023-09-29 14:23:00,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:08,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 14:23:12,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:23:13,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 14:23:15,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:23:18,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:23:18,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 14:23:21,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:23:24,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 14:23:26,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:23:27,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:27,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 14:23:28,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=392486.6666666667, ans=0.125 2023-09-29 14:23:29,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:23:30,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 14:23:34,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:23:34,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:23:36,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 14:23:37,534 INFO [train.py:1039] (0/4) Epoch 12, batch 450, loss[loss=0.2214, simple_loss=0.2813, pruned_loss=0.08073, over 23867.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2687, pruned_loss=0.06332, over 4244115.96 frames. ], batch size: 212, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:23:39,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 14:23:39,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:23:39,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:23:42,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 14:23:42,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:23:44,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:23:46,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:23:46,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 14:23:47,200 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.67 vs. limit=22.5 2023-09-29 14:23:47,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:23:48,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:23:51,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:24:00,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:24:00,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:02,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 14:24:03,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 14:24:08,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:24:11,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:13,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:17,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:17,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:17,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=392686.6666666667, ans=0.125 2023-09-29 14:24:20,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 14:24:22,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 14:24:23,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. 
Duration: 0.96 2023-09-29 14:24:24,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:24:25,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:25,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:24:26,323 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.30 vs. limit=15.0 2023-09-29 14:24:27,087 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 14:24:27,104 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 14:24:28,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:30,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:24:31,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:24:34,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:24:34,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:24:36,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. 
Duration: 0.436375 2023-09-29 14:24:36,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 14:24:39,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:40,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:24:40,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:24:42,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 14:24:48,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:24:48,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 14:24:50,217 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.154e+02 2.453e+02 3.354e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 14:24:50,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 14:24:52,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:58,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:24:59,746 INFO [train.py:1039] (0/4) Epoch 12, batch 500, loss[loss=0.2188, simple_loss=0.287, pruned_loss=0.07534, over 23246.00 frames. 
], tot_loss[loss=0.1985, simple_loss=0.2695, pruned_loss=0.06375, over 4345124.29 frames. ], batch size: 105, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:24:59,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:01,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:25:02,770 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 14:25:05,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=392886.6666666667, ans=0.0 2023-09-29 14:25:07,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:07,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:25:08,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:08,834 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 14:25:10,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 14:25:10,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:13,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 14:25:18,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:25:20,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:25:20,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:20,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:21,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:22,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=392953.3333333333, ans=0.125 2023-09-29 14:25:32,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=393020.0, ans=0.0 2023-09-29 14:25:33,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:33,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:25:34,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:25:34,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:25:34,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 14:25:34,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:25:38,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:25:38,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:25:39,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:25:39,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:39,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 14:25:43,896 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 14:25:45,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:25:47,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 14:25:49,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:25:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 14:25:55,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:25:56,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:01,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:04,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:26:04,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=393153.3333333333, ans=0.1 2023-09-29 14:26:09,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:12,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 14:26:12,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:12,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:15,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. 
Duration: 0.74 2023-09-29 14:26:17,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:26:18,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:20,267 INFO [train.py:1039] (0/4) Epoch 12, batch 550, loss[loss=0.1856, simple_loss=0.2579, pruned_loss=0.0566, over 23223.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2707, pruned_loss=0.06469, over 4424584.48 frames. ], batch size: 105, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:26:21,053 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.45 vs. limit=15.0 2023-09-29 14:26:22,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 14:26:25,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 14:26:25,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:25,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 14:26:27,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:26:27,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:28,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:26:28,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:26:28,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:26:32,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:33,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 14:26:34,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:26:36,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=393286.6666666667, ans=0.1 2023-09-29 14:26:39,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:26:39,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:42,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:26:44,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:49,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. 
Duration: 0.95 2023-09-29 14:26:50,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 14:26:52,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:26:56,459 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.93 vs. limit=22.5 2023-09-29 14:26:56,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:26:56,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:26:59,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:26:59,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=393353.3333333333, ans=22.5 2023-09-29 14:27:03,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:03,520 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 14:27:03,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:27:05,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:27:06,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 14:27:08,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:27:08,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:27:10,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:10,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=393420.0, ans=0.125 2023-09-29 14:27:11,241 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.25 vs. limit=15.0 2023-09-29 14:27:11,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 14:27:12,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 14:27:13,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:13,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:27:13,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:27:13,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:27:17,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 14:27:19,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:27:21,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:27:22,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:22,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 14:27:24,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:27:25,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:27,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:27:27,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=393486.6666666667, ans=0.125 2023-09-29 14:27:29,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:29,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:27:31,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. 
Duration: 0.4 2023-09-29 14:27:34,364 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.077e+02 2.338e+02 2.752e+02 4.124e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 14:27:37,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 14:27:41,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 14:27:43,115 INFO [train.py:1039] (0/4) Epoch 12, batch 600, loss[loss=0.1911, simple_loss=0.2499, pruned_loss=0.06616, over 23591.00 frames. ], tot_loss[loss=0.2004, simple_loss=0.271, pruned_loss=0.06489, over 4494665.24 frames. ], batch size: 256, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:27:43,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:27:44,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:27:44,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:45,722 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.62 vs. limit=22.5 2023-09-29 14:27:48,765 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.59 vs. limit=15.0 2023-09-29 14:27:51,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:27:54,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 14:27:55,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 14:27:58,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:27:58,969 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:28:00,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:01,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:05,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 14:28:05,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:28:10,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 14:28:12,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=393620.0, ans=0.1 2023-09-29 14:28:13,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:28:13,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:14,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 14:28:20,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:28:20,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:28:22,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:28,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:28:32,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:32,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:33,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:40,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=393753.3333333333, ans=0.125 2023-09-29 14:28:41,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 14:28:46,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:28:46,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:28:52,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. 
Duration: 0.89 2023-09-29 14:28:53,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:28:56,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 14:28:56,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:28:58,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:28:58,918 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-09-29 14:29:03,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:29:04,487 INFO [train.py:1039] (0/4) Epoch 12, batch 650, loss[loss=0.1907, simple_loss=0.27, pruned_loss=0.05568, over 24636.00 frames. ], tot_loss[loss=0.1987, simple_loss=0.2691, pruned_loss=0.0642, over 4539259.74 frames. ], batch size: 68, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:29:04,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:29:07,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:08,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=12.28 vs. limit=15.0 2023-09-29 14:29:10,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 14:29:12,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:15,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 14:29:16,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:29:22,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:29:22,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:26,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:29,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 14:29:31,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:29:32,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:35,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:29:35,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:29:39,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 14:29:39,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:40,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:29:41,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:42,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:29:44,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:29:45,581 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 14:29:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:45,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:29:46,107 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.533e-03 2023-09-29 14:29:49,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:49,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:49,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:29:51,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:29:51,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 14:29:53,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:29:53,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:55,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:29:57,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:58,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:30:00,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 14:30:02,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 14:30:02,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:02,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:30:03,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 14:30:03,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:30:05,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:30:12,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:12,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:14,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:30:17,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:17,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:30:17,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:20,385 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.997e+02 2.199e+02 2.485e+02 3.515e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 14:30:26,960 INFO [train.py:1039] (0/4) Epoch 12, batch 700, loss[loss=0.1803, simple_loss=0.2593, pruned_loss=0.05064, over 24625.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2673, pruned_loss=0.06344, over 4571247.87 frames. ], batch size: 60, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:30:27,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 14:30:27,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:27,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:27,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:30,771 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=394220.0, ans=0.125 2023-09-29 14:30:32,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 14:30:34,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 14:30:36,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 14:30:37,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:38,386 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=11.46 vs. limit=15.0 2023-09-29 14:30:39,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:30:42,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. 
Duration: 0.99 2023-09-29 14:30:45,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=394286.6666666667, ans=0.0 2023-09-29 14:30:46,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:49,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:30:51,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:53,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:30:53,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:57,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:58,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=394353.3333333333, ans=0.04949747468305833 2023-09-29 14:30:59,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:30:59,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:31:03,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. 
Duration: 0.42 2023-09-29 14:31:03,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=394353.3333333333, ans=0.125 2023-09-29 14:31:05,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 14:31:08,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:31:10,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:31:12,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:31:15,974 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.88 vs. limit=15.0 2023-09-29 14:31:16,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:31:18,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 14:31:21,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:21,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:31:21,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-09-29 14:31:26,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:31:26,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:26,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=394420.0, ans=0.125 2023-09-29 14:31:29,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:31:36,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:31:36,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 14:31:37,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=394486.6666666667, ans=0.1 2023-09-29 14:31:40,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 14:31:41,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 14:31:44,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:46,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:31:46,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:31:48,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:48,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 14:31:49,974 INFO [train.py:1039] (0/4) Epoch 12, batch 750, loss[loss=0.2094, simple_loss=0.2879, pruned_loss=0.06542, over 24030.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2669, pruned_loss=0.06259, over 4606882.14 frames. ], batch size: 80, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:31:52,123 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=394553.3333333333, ans=0.125 2023-09-29 14:31:52,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=394553.3333333333, ans=0.0 2023-09-29 14:31:53,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 14:31:53,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 14:31:53,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 14:31:54,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 14:31:56,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 14:31:56,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:31:56,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 14:31:57,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:59,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:31:59,873 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.81 vs. limit=22.5 2023-09-29 14:32:00,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:02,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:03,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:32:05,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:08,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:32:10,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:32:11,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 14:32:13,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:13,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:15,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 14:32:17,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:32:18,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:20,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:21,745 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.91 vs. limit=15.0 2023-09-29 14:32:23,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:32:25,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 14:32:25,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:32:25,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 14:32:25,646 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. 
Duration: 0.768 2023-09-29 14:32:25,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=394686.6666666667, ans=0.125 2023-09-29 14:32:27,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 14:32:27,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:32:27,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:32:30,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:32:37,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:37,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:37,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:32:40,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:40,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=394753.3333333333, ans=0.035 2023-09-29 14:32:41,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:41,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. 
Duration: 0.88 2023-09-29 14:32:43,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:32:45,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:32:47,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:32:48,090 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.00 vs. limit=10.0 2023-09-29 14:32:49,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:32:50,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 14:32:50,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:50,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=394753.3333333333, ans=0.125 2023-09-29 14:32:51,380 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.17 vs. limit=15.0 2023-09-29 14:32:55,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:32:57,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 14:32:57,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:00,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:33:02,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 14:33:02,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:03,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:06,840 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.038e+02 2.318e+02 2.858e+02 4.234e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 14:33:09,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:09,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:33:13,486 INFO [train.py:1039] (0/4) Epoch 12, batch 800, loss[loss=0.2165, simple_loss=0.2773, pruned_loss=0.07788, over 22782.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2681, pruned_loss=0.06327, over 4634361.09 frames. 
], batch size: 322, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:33:22,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:22,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:22,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=394886.6666666667, ans=0.07 2023-09-29 14:33:24,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:24,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:25,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:26,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:27,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:30,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=394953.3333333333, ans=0.125 2023-09-29 14:33:32,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:32,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 14:33:34,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 14:33:35,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:36,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:36,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:33:37,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:38,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 14:33:38,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:38,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 14:33:43,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:45,510 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.14 vs. limit=22.5 2023-09-29 14:33:46,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:46,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 14:33:48,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:52,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:52,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:57,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:33:57,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:33:58,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 14:34:00,342 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 14:34:00,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 14:34:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:34:00,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:03,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:03,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 14:34:07,261 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 14:34:08,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 14:34:08,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:34:09,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=395086.6666666667, ans=0.0 2023-09-29 14:34:10,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=395086.6666666667, ans=0.0 2023-09-29 14:34:11,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:34:15,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:34:18,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:34:21,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 14:34:21,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:34:25,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-09-29 14:34:32,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=395153.3333333333, ans=0.09899494936611666 2023-09-29 14:34:33,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:36,927 INFO [train.py:1039] (0/4) Epoch 12, batch 850, loss[loss=0.1688, simple_loss=0.2433, pruned_loss=0.04711, over 24586.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2688, pruned_loss=0.06348, over 4660302.84 frames. ], batch size: 60, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:34:37,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:34:37,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 14:34:37,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:34:37,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:40,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 14:34:40,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:41,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:34:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:34:45,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:34:47,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:48,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 14:34:48,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 14:34:48,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 14:34:50,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:50,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:34:53,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:53,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:53,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:35:00,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:00,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:35:00,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 14:35:06,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 14:35:09,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:10,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 14:35:14,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.03 vs. limit=6.0 2023-09-29 14:35:15,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 14:35:15,546 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=395353.3333333333, ans=0.07 2023-09-29 14:35:16,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 14:35:18,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 14:35:18,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:18,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:35:20,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-29 14:35:21,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:23,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:24,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 14:35:25,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=395420.0, ans=0.125 2023-09-29 14:35:26,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:28,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:35:28,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:35:29,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=395420.0, ans=0.0 2023-09-29 14:35:29,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=395420.0, ans=0.125 2023-09-29 14:35:30,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:35:32,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 14:35:32,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 14:35:37,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:35:37,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:37,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:35:37,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:40,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:43,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:44,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:35:46,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:35:48,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:35:48,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:35:51,266 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.55 vs. 
limit=12.0 2023-09-29 14:35:53,427 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.869e+02 2.098e+02 2.387e+02 5.753e+02, threshold=4.196e+02, percent-clipped=1.0 2023-09-29 14:35:53,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:35:56,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:56,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 14:35:57,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:35:57,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:59,362 INFO [train.py:1039] (0/4) Epoch 12, batch 900, loss[loss=0.1865, simple_loss=0.2591, pruned_loss=0.05688, over 24618.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2699, pruned_loss=0.06409, over 4674598.16 frames. ], batch size: 60, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:36:00,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 14:36:01,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=395553.3333333333, ans=0.0 2023-09-29 14:36:07,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:36:10,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 14:36:11,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 14:36:13,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:36:13,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=395553.3333333333, ans=0.1 2023-09-29 14:36:14,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 14:36:15,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.00 vs. limit=15.0 2023-09-29 14:36:16,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:36:16,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:36:16,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:16,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:36:17,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:36:21,941 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.93 vs. 
limit=15.0 2023-09-29 14:36:27,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:36:27,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:27,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:36:31,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:36,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 14:36:40,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:36:44,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:36:46,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:36:46,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=395686.6666666667, ans=0.035 2023-09-29 14:36:47,936 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 14:36:49,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 14:36:54,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 14:36:55,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:36:55,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:37:00,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:00,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:03,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 14:37:04,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:37:07,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 14:37:10,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:37:10,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:10,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:37:11,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:15,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-09-29 14:37:15,766 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 14:37:17,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:37:19,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 14:37:20,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:22,854 INFO [train.py:1039] (0/4) Epoch 12, batch 950, loss[loss=0.2011, simple_loss=0.2816, pruned_loss=0.06034, over 24338.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2693, pruned_loss=0.06373, over 4694060.83 frames. ], batch size: 77, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:37:24,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 14:37:29,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:32,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:37:35,475 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-09-29 14:37:39,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:39,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:39,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:40,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:37:40,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 14:37:42,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:37:46,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:48,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 14:37:49,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:53,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:53,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:53,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:37:54,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 14:37:54,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=396020.0, ans=0.125 2023-09-29 14:37:55,502 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.64 vs. limit=15.0 2023-09-29 14:37:56,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:37:58,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:59,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:38:04,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:04,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:38:06,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.18 vs. limit=12.0 2023-09-29 14:38:09,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 14:38:09,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=396020.0, ans=0.0 2023-09-29 14:38:12,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 14:38:12,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:38:12,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:14,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:14,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:38:19,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 14:38:19,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:38:22,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:22,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:22,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=396086.6666666667, ans=0.0 2023-09-29 14:38:24,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 14:38:24,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:38:24,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:38:26,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 14:38:26,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=396086.6666666667, ans=0.0 2023-09-29 14:38:32,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:38:34,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:38,827 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.985e+02 2.175e+02 2.387e+02 3.582e+02, threshold=4.351e+02, percent-clipped=0.0 2023-09-29 14:38:39,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:38:42,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 14:38:42,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 14:38:44,953 INFO [train.py:1039] (0/4) Epoch 12, batch 1000, loss[loss=0.2106, simple_loss=0.2658, pruned_loss=0.07765, over 23762.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2688, pruned_loss=0.06332, over 4706676.88 frames. 
], batch size: 164, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:38:45,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=396220.0, ans=0.0 2023-09-29 14:38:46,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:48,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 14:38:49,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:55,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:38:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 14:38:57,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 14:39:00,655 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=396286.6666666667, ans=0.125 2023-09-29 14:39:04,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:04,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:39:04,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:39:09,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 14:39:10,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 14:39:13,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 14:39:15,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:15,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 14:39:16,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.81 vs. limit=6.0 2023-09-29 14:39:17,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 14:39:17,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 14:39:17,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:18,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:28,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:30,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 14:39:31,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:32,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:32,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 14:39:32,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:33,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:39:34,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:35,650 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 14:39:40,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 14:39:40,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 14:39:41,470 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.07 vs. limit=15.0 2023-09-29 14:39:42,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 14:39:44,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 14:39:52,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:39:52,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:39:52,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=396486.6666666667, ans=0.1 2023-09-29 14:39:52,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=396486.6666666667, ans=0.2 2023-09-29 14:39:55,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 14:39:55,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.19 vs. limit=6.0 2023-09-29 14:39:56,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:39:57,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 14:39:58,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. 
Duration: 0.66 2023-09-29 14:40:00,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:00,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:40:02,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:40:04,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:40:07,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:08,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=396553.3333333333, ans=0.0 2023-09-29 14:40:08,957 INFO [train.py:1039] (0/4) Epoch 12, batch 1050, loss[loss=0.1672, simple_loss=0.2124, pruned_loss=0.06097, over 19075.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2672, pruned_loss=0.06308, over 4690865.20 frames. ], batch size: 388, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:40:10,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:40:12,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:40:13,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 14:40:14,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=396553.3333333333, ans=0.0 2023-09-29 14:40:15,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:18,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:21,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:40:23,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:40:24,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:40:26,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:40:26,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:40:28,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:40:29,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 14:40:29,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:31,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. 
Duration: 0.84 2023-09-29 14:40:34,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:34,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 14:40:34,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:40:43,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:44,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:40:44,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:46,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 14:40:47,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 14:40:47,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:51,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 14:40:55,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 14:40:55,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 14:40:58,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:41:00,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:41:00,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:00,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:41:05,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:41:06,791 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-09-29 14:41:08,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 14:41:09,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 14:41:10,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 14:41:10,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:10,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:41:14,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. 
Duration: 0.88 2023-09-29 14:41:17,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:41:20,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:20,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:41:22,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:22,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:24,159 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.921e+02 2.123e+02 2.375e+02 5.047e+02, threshold=4.247e+02, percent-clipped=1.0 2023-09-29 14:41:26,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:26,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 14:41:28,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:28,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 14:41:29,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 14:41:29,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 14:41:30,457 INFO [train.py:1039] (0/4) Epoch 12, batch 1100, loss[loss=0.1842, simple_loss=0.2668, pruned_loss=0.05076, over 24288.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2671, pruned_loss=0.06249, over 4704208.86 frames. ], batch size: 61, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:41:33,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:41:34,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=396886.6666666667, ans=0.0 2023-09-29 14:41:38,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:41:40,174 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=396886.6666666667, ans=0.07 2023-09-29 14:41:43,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:41:45,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:41:46,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:46,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 14:41:48,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:50,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 14:41:54,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:41:57,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:41:57,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 14:41:59,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:41:59,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:59,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:41:59,882 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.04 vs. limit=10.0 2023-09-29 14:42:02,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:42:04,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:42:08,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:42:11,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 14:42:11,945 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. 
Duration: 0.896 2023-09-29 14:42:13,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:16,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:17,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:42:17,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:42:19,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 14:42:21,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:42:21,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:42:21,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:42:23,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:23,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 14:42:25,691 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-09-29 14:42:30,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 14:42:31,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 14:42:33,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:42:35,169 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=397153.3333333333, ans=0.125 2023-09-29 14:42:37,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:42:41,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 14:42:41,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:42:41,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:41,662 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=397153.3333333333, ans=0.0 2023-09-29 14:42:43,733 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.91 vs. limit=15.0 2023-09-29 14:42:44,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:42:44,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:42:44,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 14:42:44,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:42:44,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:46,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 14:42:46,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:42:47,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 14:42:48,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:42:48,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:42:49,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:42:51,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=397220.0, ans=0.2 2023-09-29 14:42:52,848 INFO [train.py:1039] (0/4) Epoch 12, batch 1150, loss[loss=0.1813, simple_loss=0.2541, pruned_loss=0.05427, over 24584.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2684, pruned_loss=0.06272, over 4719914.38 frames. 
], batch size: 60, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:42:57,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:01,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:43:03,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:03,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:43:03,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 14:43:03,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:06,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 14:43:08,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:08,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:43:09,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=397286.6666666667, ans=0.125 2023-09-29 14:43:14,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 14:43:15,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 14:43:20,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:20,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=397286.6666666667, ans=0.2 2023-09-29 14:43:21,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:21,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 14:43:21,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:43:21,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:25,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 14:43:27,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:29,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:30,144 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.74 vs. 
limit=15.0 2023-09-29 14:43:31,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=397353.3333333333, ans=0.125 2023-09-29 14:43:38,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=397353.3333333333, ans=0.0 2023-09-29 14:43:39,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:45,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:45,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 14:43:47,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:47,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:49,063 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:43:50,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=397420.0, ans=0.125 2023-09-29 14:43:54,857 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 14:43:56,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:44:00,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=397486.6666666667, ans=0.1 2023-09-29 14:44:03,180 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 14:44:04,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=397486.6666666667, ans=0.125 2023-09-29 14:44:08,541 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.949e+02 2.168e+02 2.522e+02 3.297e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 14:44:08,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:08,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:44:08,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:44:10,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:44:14,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:15,309 INFO [train.py:1039] (0/4) Epoch 12, batch 1200, loss[loss=0.2246, simple_loss=0.275, pruned_loss=0.08711, over 23775.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.269, pruned_loss=0.06294, over 4710766.44 frames. ], batch size: 150, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:44:19,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 14:44:19,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:44:21,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:21,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:21,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=397553.3333333333, ans=0.0 2023-09-29 14:44:23,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:44:24,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:44:26,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:44:26,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=397553.3333333333, ans=0.1 2023-09-29 14:44:29,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:29,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:44:29,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=397620.0, ans=0.125 2023-09-29 14:44:31,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=397620.0, ans=0.0 2023-09-29 14:44:32,245 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 14:44:32,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=397620.0, ans=0.0 2023-09-29 14:44:34,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=397620.0, ans=0.0 2023-09-29 14:44:35,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 14:44:39,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:44:42,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:44:44,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:46,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:44:46,544 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 14:44:48,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:44:54,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:44:54,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:44:55,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 14:44:55,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:45:00,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 14:45:03,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 14:45:05,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:45:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:45:06,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:06,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:45:08,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:45:09,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 14:45:10,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:45:11,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 14:45:11,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:45:11,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:11,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:45:14,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:14,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:21,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:45:22,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:45:25,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 14:45:29,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=397820.0, ans=0.0 2023-09-29 14:45:30,663 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. 
Duration: 0.9935 2023-09-29 14:45:30,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:45:33,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:35,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:45:36,953 INFO [train.py:1039] (0/4) Epoch 12, batch 1250, loss[loss=0.1836, simple_loss=0.2654, pruned_loss=0.05094, over 24455.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2712, pruned_loss=0.06392, over 4705014.51 frames. ], batch size: 69, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:45:37,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:40,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 14:45:46,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:45:48,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:48,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 14:45:51,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:45:51,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 14:45:57,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:45:58,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:58,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:45:58,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:45:59,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=397953.3333333333, ans=0.125 2023-09-29 14:46:01,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:46:05,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:46:05,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:05,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:06,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=397953.3333333333, ans=0.125 2023-09-29 14:46:07,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 14:46:07,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:11,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:12,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:46:12,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=398020.0, ans=0.0 2023-09-29 14:46:17,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 14:46:19,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:46:22,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:22,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 14:46:22,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:46:22,617 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 14:46:23,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:23,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:46:27,050 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.49 vs. limit=6.0 2023-09-29 14:46:29,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:31,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=398086.6666666667, ans=0.1 2023-09-29 14:46:32,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:46:34,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 14:46:34,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 14:46:34,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 14:46:39,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:46:39,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 14:46:39,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:42,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-09-29 14:46:42,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:46:43,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 14:46:45,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:46:45,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:46:46,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:46:46,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:48,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 14:46:50,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:50,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=398153.3333333333, ans=0.0 2023-09-29 14:46:52,056 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.018e+02 2.304e+02 2.594e+02 4.435e+02, threshold=4.607e+02, percent-clipped=1.0 2023-09-29 14:46:52,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:46:53,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 14:46:57,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:58,617 INFO [train.py:1039] (0/4) Epoch 12, batch 1300, loss[loss=0.1813, simple_loss=0.2672, pruned_loss=0.04766, over 24343.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2725, pruned_loss=0.06466, over 4702557.82 frames. ], batch size: 74, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:47:02,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:47:02,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 14:47:05,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:07,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:47:08,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:10,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:47:11,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:47:13,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 14:47:19,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 14:47:20,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:47:22,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 14:47:27,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:47:30,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:30,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:33,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:36,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:38,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:47:38,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:47:38,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 14:47:43,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:47:43,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 14:47:44,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 14:47:46,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:47:47,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:47:50,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:50,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 14:47:52,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:52,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 14:47:53,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:59,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:59,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:48:02,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 14:48:03,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. 
Duration: 0.84 2023-09-29 14:48:05,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 14:48:12,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:48:14,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 14:48:15,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:20,327 INFO [train.py:1039] (0/4) Epoch 12, batch 1350, loss[loss=0.1874, simple_loss=0.2372, pruned_loss=0.06884, over 22600.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2711, pruned_loss=0.06394, over 4706588.77 frames. ], batch size: 322, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:48:20,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 14:48:22,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=398553.3333333333, ans=0.0 2023-09-29 14:48:23,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:25,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:30,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:30,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:48:32,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:48:33,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:36,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:38,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 14:48:39,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:48:42,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:48:45,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 14:48:45,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:48:48,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:48:48,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 14:48:48,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 14:48:49,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. 
Duration: 0.99 2023-09-29 14:48:52,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 14:49:05,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:17,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:17,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 14:49:19,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=18.05 vs. limit=22.5 2023-09-29 14:49:20,886 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.37 vs. limit=22.5 2023-09-29 14:49:21,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:21,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 14:49:21,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:49:23,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 14:49:24,601 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.79 vs. limit=15.0 2023-09-29 14:49:25,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:49:27,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 14:49:28,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:49:33,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 14:49:35,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 14:49:36,889 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.916e+02 2.100e+02 2.366e+02 3.252e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-29 14:49:38,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=398820.0, ans=0.0 2023-09-29 14:49:42,872 INFO [train.py:1039] (0/4) Epoch 12, batch 1400, loss[loss=0.2319, simple_loss=0.3124, pruned_loss=0.0757, over 24566.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2699, pruned_loss=0.06387, over 4694796.18 frames. ], batch size: 71, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:49:43,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 14:49:44,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:49:48,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:49:49,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:49:55,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 14:49:57,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 14:50:03,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=398953.3333333333, ans=0.125 2023-09-29 14:50:07,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:50:08,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:11,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:50:11,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:50:16,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:50:18,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:50:27,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:50:30,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:34,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 14:50:34,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:50:34,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:50:36,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:50:37,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:39,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:50:39,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:50:39,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:50:41,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 14:50:41,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:50:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:50:50,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:50:56,193 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=399153.3333333333, ans=0.125 2023-09-29 14:50:57,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 14:50:58,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:51:00,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:51:02,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:51:02,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:05,591 INFO [train.py:1039] (0/4) Epoch 12, batch 1450, loss[loss=0.2046, simple_loss=0.2841, pruned_loss=0.06258, over 24360.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2699, pruned_loss=0.06339, over 4717317.32 frames. ], batch size: 77, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:51:05,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:51:09,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:51:12,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:51:12,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:12,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:51:16,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=399220.0, ans=0.125 2023-09-29 14:51:17,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:18,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:51:20,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:51:20,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 14:51:22,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:51:22,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 14:51:23,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:23,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:23,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. 
Duration: 0.83 2023-09-29 14:51:24,488 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.55 vs. limit=15.0 2023-09-29 14:51:27,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:28,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:51:29,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 14:51:29,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:29,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:51:30,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:33,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:39,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:51:39,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:51:42,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:51:43,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:44,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:45,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:51:45,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:46,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:51:49,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 14:51:53,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:56,345 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 14:51:57,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:51:59,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:52:01,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:02,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. 
Duration: 0.98 2023-09-29 14:52:07,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:09,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 14:52:10,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 14:52:12,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:14,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:14,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:52:17,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 14:52:19,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 14:52:20,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 14:52:21,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:22,749 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.049e+02 2.244e+02 2.638e+02 4.746e+02, threshold=4.488e+02, percent-clipped=1.0 2023-09-29 14:52:24,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 14:52:29,499 INFO [train.py:1039] (0/4) Epoch 12, batch 1500, loss[loss=0.2263, simple_loss=0.2867, pruned_loss=0.08301, over 22748.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2701, pruned_loss=0.06396, over 4711409.92 frames. ], batch size: 322, lr: 8.73e-03, grad_scale: 32.0 2023-09-29 14:52:37,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 14:52:37,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:52:37,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:52:39,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:40,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:40,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:52:42,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 14:52:42,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:52:43,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:52:43,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 14:52:43,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:45,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:52:47,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 14:52:54,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:52:56,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:52:57,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:02,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 14:53:05,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 14:53:07,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:53:07,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-09-29 14:53:10,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:53:13,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:13,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:13,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:53:15,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 14:53:16,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:53:16,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:16,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 14:53:17,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=399753.3333333333, ans=0.125 2023-09-29 14:53:18,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:23,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:53:23,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. 
Duration: 0.94 2023-09-29 14:53:23,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=399753.3333333333, ans=0.0 2023-09-29 14:53:27,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=399753.3333333333, ans=0.05 2023-09-29 14:53:28,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:53:30,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:53:34,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=399820.0, ans=0.0 2023-09-29 14:53:35,657 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 14:53:35,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:37,064 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 14:53:38,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:53:38,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:53:40,142 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 14:53:41,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 14:53:44,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 14:53:45,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:50,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:51,419 INFO [train.py:1039] (0/4) Epoch 12, batch 1550, loss[loss=0.2094, simple_loss=0.2896, pruned_loss=0.06458, over 23974.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.2699, pruned_loss=0.0636, over 4731243.08 frames. ], batch size: 80, lr: 8.73e-03, grad_scale: 16.0 2023-09-29 14:53:51,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:51,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:53,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 14:53:55,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 14:53:55,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 14:53:55,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=399886.6666666667, ans=0.1 2023-09-29 14:53:56,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 14:53:57,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 14:54:00,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:00,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:01,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:01,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:54:02,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=399886.6666666667, ans=0.2 2023-09-29 14:54:03,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:03,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:07,895 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 14:54:07,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:54:07,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:54:08,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=399953.3333333333, ans=0.0 2023-09-29 14:54:10,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:54:12,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:54:12,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 14:54:12,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=399953.3333333333, ans=0.125 2023-09-29 14:54:15,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:15,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 14:54:16,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 14:54:16,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 14:54:16,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:16,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 14:54:18,596 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-60000.pt 2023-09-29 14:54:22,226 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=399953.3333333333, ans=0.1 2023-09-29 14:54:26,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:54:28,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 14:54:28,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 14:54:28,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=400020.0, ans=0.125 2023-09-29 14:54:37,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:40,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:42,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:54:42,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:54:42,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 14:54:49,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 14:54:50,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:53,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:54:55,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:54:55,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:55,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 14:54:55,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=400086.6666666667, ans=0.125 2023-09-29 14:54:57,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:54:59,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:54:59,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:59,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 14:54:59,213 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 14:55:02,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 14:55:09,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 14:55:11,981 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.422e+02 2.047e+02 2.446e+02 2.941e+02 5.003e+02, threshold=4.892e+02, percent-clipped=3.0 2023-09-29 14:55:12,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:12,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=400153.3333333333, ans=0.0 2023-09-29 14:55:13,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:55:13,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 14:55:16,842 INFO [train.py:1039] (0/4) Epoch 12, batch 1600, loss[loss=0.1792, simple_loss=0.2609, pruned_loss=0.04877, over 24640.00 frames. ], tot_loss[loss=0.1987, simple_loss=0.2703, pruned_loss=0.06355, over 4718598.24 frames. ], batch size: 68, lr: 8.72e-03, grad_scale: 32.0 2023-09-29 14:55:16,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:55:18,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:18,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:55:18,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 14:55:19,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:55:24,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:24,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 14:55:26,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 14:55:29,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 14:55:31,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:55:33,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 14:55:34,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:55:35,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:55:41,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:55:44,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 14:55:48,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:55:49,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 14:55:50,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:50,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 14:55:52,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=400353.3333333333, ans=0.2 2023-09-29 14:55:57,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 14:55:58,122 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.55 vs. limit=8.0 2023-09-29 14:56:04,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:04,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 14:56:06,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:06,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:06,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:56:09,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-09-29 14:56:13,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 14:56:13,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:56:13,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:14,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:16,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:56:17,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:56:18,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:56:19,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:56:22,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=400486.6666666667, ans=0.125 2023-09-29 14:56:24,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=400486.6666666667, ans=0.2 2023-09-29 14:56:27,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:27,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 14:56:30,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 14:56:30,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:56:30,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 14:56:36,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:39,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:56:39,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:56:41,186 INFO [train.py:1039] (0/4) Epoch 12, batch 1650, loss[loss=0.2082, simple_loss=0.2816, pruned_loss=0.06742, over 23279.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.271, pruned_loss=0.06406, over 4708862.28 frames. ], batch size: 105, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:56:41,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 14:56:41,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 14:56:41,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 14:56:41,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. 
Duration: 0.59 2023-09-29 14:56:45,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=400553.3333333333, ans=0.125 2023-09-29 14:56:46,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:48,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:48,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:56:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:56:51,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:55,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 14:56:57,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:57,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:57,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:56:57,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:56:58,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. 
Duration: 0.73 2023-09-29 14:56:58,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 14:57:06,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:57:07,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:57:07,853 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.77 vs. limit=15.0 2023-09-29 14:57:16,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 14:57:17,755 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.36 vs. limit=15.0 2023-09-29 14:57:19,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:22,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 14:57:25,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:28,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:57:29,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 14:57:29,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:31,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:57:31,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:34,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:57:36,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:36,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:36,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:37,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:39,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:57:42,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:42,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=400753.3333333333, ans=0.0 2023-09-29 14:57:44,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. 
Duration: 0.56 2023-09-29 14:57:45,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:47,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 14:57:47,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 14:57:47,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 14:57:47,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:49,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:57:49,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:51,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:51,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 14:57:56,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:58,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:57:58,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 14:57:59,740 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.879e+02 2.089e+02 2.451e+02 3.406e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 14:58:00,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 14:58:03,035 INFO [train.py:1039] (0/4) Epoch 12, batch 1700, loss[loss=0.1851, simple_loss=0.2609, pruned_loss=0.0546, over 24505.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2698, pruned_loss=0.06319, over 4707974.41 frames. ], batch size: 63, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:58:04,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:58:04,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:58:05,398 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.49 vs. limit=15.0 2023-09-29 14:58:06,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 14:58:06,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:06,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:58:06,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:58:06,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=400886.6666666667, ans=0.0 2023-09-29 14:58:09,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:58:09,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=400886.6666666667, ans=0.125 2023-09-29 14:58:10,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:58:10,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 14:58:13,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:58:17,136 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.11 vs. limit=6.0 2023-09-29 14:58:18,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:21,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:58:29,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:58:29,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 14:58:29,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:29,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:58:33,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 14:58:35,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:58:35,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:35,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=401020.0, ans=0.125 2023-09-29 14:58:37,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:58:38,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:58:41,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 14:58:41,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 14:58:43,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:46,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. 
Duration: 0.68 2023-09-29 14:58:46,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:58:47,976 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.32 vs. limit=15.0 2023-09-29 14:58:48,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=401020.0, ans=0.2 2023-09-29 14:58:55,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:58:57,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:58:58,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:59:00,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:59:00,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 14:59:00,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:59:02,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 14:59:02,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 14:59:02,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:02,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:07,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:07,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:59:08,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:08,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:59:08,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:12,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:13,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 14:59:17,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:18,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 14:59:21,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 14:59:27,055 INFO [train.py:1039] (0/4) Epoch 12, batch 1750, loss[loss=0.1891, simple_loss=0.2758, pruned_loss=0.05117, over 24685.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2689, pruned_loss=0.06283, over 4717184.17 frames. ], batch size: 68, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 14:59:28,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:31,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:31,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:59:31,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=401220.0, ans=0.1 2023-09-29 14:59:32,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=401220.0, ans=0.125 2023-09-29 14:59:33,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 14:59:33,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:36,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:59:36,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 14:59:41,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 14:59:43,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:46,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 14:59:46,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:47,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:59:51,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 14:59:53,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 14:59:54,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:56,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 14:59:56,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=401286.6666666667, ans=0.125 2023-09-29 15:00:03,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=401353.3333333333, ans=0.0 2023-09-29 15:00:04,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 15:00:06,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:06,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:13,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:13,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:16,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:00:17,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:19,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:20,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:00:21,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 15:00:24,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:26,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 15:00:28,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 15:00:29,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:29,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:00:33,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:00:34,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 15:00:34,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:36,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=401486.6666666667, ans=0.2 2023-09-29 15:00:37,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:42,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:43,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:00:45,915 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.394e+02 1.915e+02 2.238e+02 2.656e+02 3.754e+02, threshold=4.475e+02, percent-clipped=0.0 2023-09-29 15:00:46,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:00:46,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. 
Duration: 0.87 2023-09-29 15:00:46,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:49,096 INFO [train.py:1039] (0/4) Epoch 12, batch 1800, loss[loss=0.2022, simple_loss=0.277, pruned_loss=0.06373, over 24425.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2683, pruned_loss=0.06184, over 4731676.22 frames. ], batch size: 77, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:00:49,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:00:49,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:00:49,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:00:49,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:00:51,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:00:54,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:00:55,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:58,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:00:59,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:01:02,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:01:04,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:01:07,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:11,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:11,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:12,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:01:12,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=401620.0, ans=0.1 2023-09-29 15:01:14,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:01:14,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 15:01:15,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:19,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:24,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. 
Duration: 0.83 2023-09-29 15:01:25,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 15:01:25,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 15:01:27,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:29,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:29,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:01:29,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:01:32,404 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.36 vs. limit=15.0 2023-09-29 15:01:39,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 15:01:39,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:01:41,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:42,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 15:01:42,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. 
Duration: 0.86 2023-09-29 15:01:44,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:01:46,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:01:46,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:01:50,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 15:01:51,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-09-29 15:01:52,653 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:01:56,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=401820.0, ans=0.0 2023-09-29 15:01:59,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:01:59,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 15:01:59,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:59,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:59,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 15:02:01,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 15:02:04,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:02:04,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:08,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 15:02:08,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:11,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:11,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:02:11,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:13,211 INFO [train.py:1039] (0/4) Epoch 12, batch 1850, loss[loss=0.1689, simple_loss=0.2375, pruned_loss=0.05019, over 24351.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2675, pruned_loss=0.06168, over 4729098.03 frames. ], batch size: 56, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:02:13,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:14,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 15:02:17,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:02:17,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:19,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:02:20,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:02:24,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=401886.6666666667, ans=0.0 2023-09-29 15:02:27,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:02:27,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 15:02:32,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 15:02:35,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 15:02:40,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:40,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 15:02:40,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-29 15:02:51,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:02:51,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 15:02:54,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:02:54,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:02:57,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 15:02:57,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:57,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:03:00,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:03:03,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:03:07,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:11,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:03:11,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:03:11,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:03:11,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:15,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:16,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:03:20,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=402153.3333333333, ans=0.0 2023-09-29 15:03:21,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 15:03:23,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:25,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:03:26,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:03:26,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 15:03:26,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 15:03:28,135 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. 
Duration: 0.896 2023-09-29 15:03:28,267 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 15:03:29,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:03:29,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:03:29,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:30,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:31,396 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 15:03:31,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:03:31,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:32,690 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.242e+02 2.724e+02 5.145e+02, threshold=4.485e+02, percent-clipped=1.0 2023-09-29 15:03:32,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:03:34,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:03:35,676 INFO [train.py:1039] (0/4) Epoch 12, batch 1900, loss[loss=0.1923, simple_loss=0.272, pruned_loss=0.05633, over 24495.00 frames. 
], tot_loss[loss=0.1964, simple_loss=0.2686, pruned_loss=0.06206, over 4731513.08 frames. ], batch size: 66, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:03:37,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:03:37,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 15:03:39,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=402220.0, ans=0.125 2023-09-29 15:03:40,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:40,403 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 15:03:42,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:03:43,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:48,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:50,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:03:50,977 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 15:03:53,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-09-29 15:03:54,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:54,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:54,805 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 15:03:56,700 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 15:03:58,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 15:04:01,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:04:03,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 15:04:05,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 15:04:15,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 15:04:17,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 15:04:17,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:04:19,110 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 15:04:19,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-09-29 15:04:19,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 15:04:19,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 15:04:19,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:04:21,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=402353.3333333333, ans=0.125 2023-09-29 15:04:24,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 15:04:29,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:04:33,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:33,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 15:04:33,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=402420.0, ans=0.2 2023-09-29 15:04:37,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:04:40,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 15:04:40,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 15:04:46,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:04:46,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:04:46,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:04:48,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:04:49,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:04:49,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:04:51,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:04:54,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:04:54,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:04:56,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:04:56,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:56,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 15:04:58,333 INFO [train.py:1039] (0/4) Epoch 12, batch 1950, loss[loss=0.1744, simple_loss=0.2533, pruned_loss=0.04782, over 24484.00 frames. ], tot_loss[loss=0.1978, simple_loss=0.2697, pruned_loss=0.06298, over 4729439.94 frames. ], batch size: 66, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:04:59,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:05:04,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:05,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:05:05,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:05,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:05:09,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=402553.3333333333, ans=0.0 2023-09-29 15:05:10,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 15:05:10,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:05:10,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:13,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:05:16,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:05:16,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:17,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:18,770 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-29 15:05:19,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:21,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:21,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:05:22,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:05:22,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:28,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:31,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:05:31,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:05:31,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:05:31,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 15:05:33,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:05:33,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:05:33,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:38,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:41,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:05:45,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:05:48,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:05:49,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:05:49,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 15:05:49,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:05:54,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:54,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:05:55,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:02,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:02,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:06,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:08,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:11,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:06:12,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:13,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 15:06:13,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:06:14,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:06:16,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 15:06:17,819 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.002e+02 2.294e+02 2.547e+02 3.463e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 15:06:19,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:20,836 INFO [train.py:1039] (0/4) Epoch 12, batch 2000, loss[loss=0.172, simple_loss=0.2616, pruned_loss=0.0412, over 24462.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2695, pruned_loss=0.06294, over 4724464.50 frames. ], batch size: 69, lr: 8.70e-03, grad_scale: 32.0 2023-09-29 15:06:23,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:24,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:06:25,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:06:25,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:06:27,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:31,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 15:06:32,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 15:06:34,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:06:35,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 15:06:38,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:06:38,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:41,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:06:42,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 15:06:44,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:46,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 15:06:48,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=402953.3333333333, ans=0.0 2023-09-29 15:06:49,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 15:06:51,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 15:06:51,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:06:54,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:06:55,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:06:55,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:57,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:00,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:01,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 15:07:03,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 15:07:03,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:07:03,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:07:03,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=403020.0, ans=0.125 2023-09-29 15:07:06,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:10,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:07:10,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:11,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:07:13,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:13,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:13,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:13,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:15,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:07:15,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=403086.6666666667, ans=0.0 2023-09-29 15:07:19,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:19,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 15:07:23,552 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.03 vs. limit=15.0 2023-09-29 15:07:24,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:07:24,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:07:32,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:33,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:07:33,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:07:33,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:07:37,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:38,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:42,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:44,186 INFO [train.py:1039] (0/4) Epoch 12, batch 2050, loss[loss=0.2076, simple_loss=0.2797, pruned_loss=0.0677, over 23481.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.2693, pruned_loss=0.06368, over 4712787.95 frames. ], batch size: 93, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:07:45,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:51,799 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.73 vs. limit=22.5 2023-09-29 15:07:52,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:55,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:07:56,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:07:57,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:00,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 15:08:00,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:08:01,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:02,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:08:10,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:10,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:13,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 15:08:13,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=403286.6666666667, ans=0.125 2023-09-29 15:08:14,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:18,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 15:08:18,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 15:08:20,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:22,477 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=22.5 2023-09-29 15:08:23,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:23,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:08:25,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:27,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:08:29,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:08:29,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:08:33,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:35,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:08:36,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 15:08:39,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:42,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:08:47,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:48,636 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.54 vs. limit=22.5 2023-09-29 15:08:49,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 15:08:55,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:08:56,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:08:59,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:09:01,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 15:09:04,764 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.918e+02 2.083e+02 2.421e+02 3.715e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-29 15:09:06,258 INFO [train.py:1039] (0/4) Epoch 12, batch 2100, loss[loss=0.2048, simple_loss=0.2775, pruned_loss=0.06611, over 24083.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2677, pruned_loss=0.063, over 4718034.42 frames. 
], batch size: 80, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:09:06,476 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 15:09:06,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:07,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:07,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:09,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:09:09,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 15:09:09,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 15:09:12,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:09:15,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:09:15,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:09:18,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:20,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:09:20,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 15:09:20,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:09:21,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 15:09:21,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 15:09:23,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:23,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:09:23,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 15:09:23,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:09:26,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=403620.0, ans=0.1 2023-09-29 15:09:29,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 15:09:29,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:34,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:09:36,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:39,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:09:39,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 15:09:40,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:40,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 15:09:41,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 15:09:43,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:43,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 15:09:43,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 15:09:43,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 15:09:43,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=403686.6666666667, ans=0.1 2023-09-29 15:09:46,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 15:09:47,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:09:49,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=403686.6666666667, ans=0.04949747468305833 2023-09-29 15:09:50,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:52,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:54,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:55,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:55,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 15:09:55,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:55,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:57,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:58,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.84 vs. 
limit=22.5 2023-09-29 15:09:59,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 15:10:00,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 15:10:02,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 15:10:06,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:10:09,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:10:09,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 15:10:15,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:19,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:10:19,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:10:19,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:10:19,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 15:10:19,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 15:10:21,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:21,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:10:22,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:10:22,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:25,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 15:10:25,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=403820.0, ans=0.125 2023-09-29 15:10:27,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 15:10:27,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:28,534 INFO [train.py:1039] (0/4) Epoch 12, batch 2150, loss[loss=0.1905, simple_loss=0.2541, pruned_loss=0.0635, over 22858.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.267, pruned_loss=0.06238, over 4724031.37 frames. ], batch size: 322, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:10:28,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:10:28,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 15:10:30,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:10:30,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:10:35,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=403886.6666666667, ans=0.125 2023-09-29 15:10:37,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:10:37,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:39,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:39,518 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:10:40,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:10:40,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:10:45,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:47,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:10:47,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:10:51,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:51,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 15:10:54,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=403953.3333333333, ans=0.125 2023-09-29 15:10:56,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:10:57,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:10:59,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:59,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:00,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:00,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 15:11:00,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=404020.0, ans=0.2 2023-09-29 15:11:01,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:01,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:11:02,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:11:03,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 15:11:05,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:11:07,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:07,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:08,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:11:12,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:11:12,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=404020.0, ans=0.07 2023-09-29 15:11:14,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:11:14,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:11:15,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:15,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 15:11:15,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:11:18,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:18,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:21,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:22,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:11:22,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:24,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:24,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 15:11:26,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. 
Duration: 0.91 2023-09-29 15:11:27,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:11:27,809 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 15:11:27,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:29,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:11:29,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 15:11:29,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:11:29,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 15:11:30,762 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 15:11:30,763 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 15:11:30,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 15:11:33,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:33,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:11:33,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:11:33,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:35,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:11:37,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=404153.3333333333, ans=0.125 2023-09-29 15:11:38,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:38,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:48,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:11:49,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 15:11:50,276 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.936e+02 2.140e+02 2.613e+02 4.157e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 15:11:51,874 INFO [train.py:1039] (0/4) Epoch 12, batch 2200, loss[loss=0.1938, simple_loss=0.2756, pruned_loss=0.05598, over 24473.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2676, pruned_loss=0.06269, over 4704685.28 frames. ], batch size: 69, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:11:52,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 15:11:57,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.63 vs. limit=15.0 2023-09-29 15:11:59,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:59,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:12:00,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:02,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:12:04,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:12:04,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:12:04,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 15:12:09,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 15:12:11,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:12:18,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 15:12:22,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:12:24,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:24,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:12:24,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=404353.3333333333, ans=0.1 2023-09-29 15:12:25,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=404353.3333333333, ans=0.125 2023-09-29 15:12:27,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:12:28,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 15:12:32,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:12:33,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:34,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:12:37,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:12:37,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:12:41,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=404420.0, ans=0.1 2023-09-29 15:12:42,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:12:43,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:45,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 15:12:45,712 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=404420.0, ans=0.125 2023-09-29 15:12:46,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:47,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 15:12:50,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:12:52,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:54,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:55,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:12:55,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:55,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:57,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:12:57,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:13:00,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:13:03,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:13:04,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:07,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:13:08,852 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 15:13:12,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:13:12,397 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 15:13:13,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 15:13:13,992 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 15:13:14,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:15,598 INFO [train.py:1039] (0/4) Epoch 12, batch 2250, loss[loss=0.2726, simple_loss=0.3194, pruned_loss=0.1129, over 19407.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2681, pruned_loss=0.06279, over 4713772.32 frames. ], batch size: 388, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:13:15,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:13:17,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:18,902 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 15:13:20,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:13:23,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:29,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:13:29,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 15:13:30,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=404553.3333333333, ans=0.125 2023-09-29 15:13:32,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:34,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:35,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:36,652 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.49 vs. limit=15.0 2023-09-29 15:13:38,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 15:13:38,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:13:38,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:13:40,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 15:13:40,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:13:42,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:44,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 15:13:50,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:52,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:13:52,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:13:55,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 15:13:56,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:56,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:14:03,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:05,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:06,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:14:06,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:14:07,036 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=404753.3333333333, ans=0.125 2023-09-29 15:14:09,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:14:11,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:14:11,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=404753.3333333333, ans=0.125 2023-09-29 15:14:16,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:14:18,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:14:23,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:14:23,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:14:23,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:14:28,864 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=22.5 2023-09-29 15:14:29,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-29 15:14:31,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:14:31,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 15:14:31,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:32,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:14:34,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 15:14:36,004 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.000e+02 2.393e+02 2.961e+02 4.518e+02, threshold=4.786e+02, percent-clipped=2.0 2023-09-29 15:14:38,327 INFO [train.py:1039] (0/4) Epoch 12, batch 2300, loss[loss=0.262, simple_loss=0.3119, pruned_loss=0.106, over 19516.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.269, pruned_loss=0.06342, over 4709376.51 frames. ], batch size: 389, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:14:38,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:14:38,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:45,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:46,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 15:14:46,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=404886.6666666667, ans=0.125 2023-09-29 15:14:49,621 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 15:14:51,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:00,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:15:00,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:15:00,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:02,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:02,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 15:15:02,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:15:05,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:05,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:15:08,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 15:15:08,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=405020.0, ans=10.0 2023-09-29 15:15:13,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:15:14,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=405020.0, ans=0.125 2023-09-29 15:15:15,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:20,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:15:20,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:23,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:15:26,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:15:27,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=405086.6666666667, ans=0.125 2023-09-29 15:15:30,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:32,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:15:32,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 15:15:32,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 15:15:36,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:15:36,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:36,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:36,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:15:38,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:38,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:15:38,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:15:39,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 15:15:39,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:15:39,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:41,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-09-29 15:15:50,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:15:51,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:15:57,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:58,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:15:58,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:15:58,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:16:00,273 INFO [train.py:1039] (0/4) Epoch 12, batch 2350, loss[loss=0.2705, simple_loss=0.3164, pruned_loss=0.1123, over 19484.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2705, pruned_loss=0.06414, over 4707796.83 frames. ], batch size: 388, lr: 8.67e-03, grad_scale: 8.0 2023-09-29 15:16:00,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:00,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:16:01,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-09-29 15:16:04,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=405220.0, ans=0.0 2023-09-29 15:16:07,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:07,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 15:16:13,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 15:16:16,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:16:21,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:23,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:24,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 15:16:28,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:16:33,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=405353.3333333333, ans=0.02 2023-09-29 15:16:35,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 15:16:35,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:38,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:16:38,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:16:42,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 15:16:43,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:16:44,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:44,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:16:44,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:16:46,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 15:16:48,864 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.43 vs. limit=15.0 2023-09-29 15:16:49,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 15:16:50,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:55,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:55,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:16:55,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 15:16:56,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:16:59,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 15:16:59,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:17:06,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 15:17:11,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 15:17:12,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:17:12,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:17:14,159 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 15:17:14,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 15:17:17,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 15:17:20,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:17:22,272 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.128e+02 2.368e+02 3.787e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 15:17:22,316 INFO [train.py:1039] (0/4) Epoch 12, batch 2400, loss[loss=0.197, simple_loss=0.2625, pruned_loss=0.06577, over 23365.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2696, pruned_loss=0.06333, over 4712087.76 frames. ], batch size: 119, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:17:25,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:17:27,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:17:30,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:17:30,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-09-29 15:17:30,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 15:17:40,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:17:40,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:17:40,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 15:17:42,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:17:43,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:43,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 15:17:48,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:50,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 15:17:56,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:18:02,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 15:18:05,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:18:06,625 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.71 vs. limit=15.0 2023-09-29 15:18:07,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:09,975 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.39 vs. limit=15.0 2023-09-29 15:18:12,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:12,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 15:18:12,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:18:20,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:23,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:18:26,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:26,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:18:26,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 15:18:28,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:18:28,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:28,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:28,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:18:33,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:18:35,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:18:35,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 15:18:37,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 15:18:37,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.max_abs, batch_count=405820.0, ans=10.0 2023-09-29 15:18:39,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:39,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:39,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. 
Duration: 0.54 2023-09-29 15:18:41,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 15:18:41,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 15:18:41,457 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 15:18:43,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 15:18:44,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:46,474 INFO [train.py:1039] (0/4) Epoch 12, batch 2450, loss[loss=0.1861, simple_loss=0.228, pruned_loss=0.07214, over 19110.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2681, pruned_loss=0.06266, over 4711463.64 frames. ], batch size: 388, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:18:46,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:46,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:46,679 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 15:18:48,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:49,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 15:18:54,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:18:54,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:57,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:57,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:59,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 15:19:02,926 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.51 vs. limit=22.5 2023-09-29 15:19:05,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:05,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:07,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=405953.3333333333, ans=0.1 2023-09-29 15:19:08,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:19:08,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 15:19:08,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:19:08,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 15:19:12,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=405953.3333333333, ans=0.125 2023-09-29 15:19:14,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:17,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:19:19,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:19:21,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:19:21,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:21,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=406020.0, ans=0.1 2023-09-29 15:19:24,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:19:26,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 15:19:27,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:19:31,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=406020.0, ans=0.0 2023-09-29 15:19:32,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:34,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:35,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:35,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:19:35,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:37,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:19:38,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 15:19:40,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:40,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 15:19:45,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:19:45,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:51,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:19:51,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 15:19:52,491 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.62 vs. limit=15.0 2023-09-29 15:19:53,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:19:53,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:54,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 15:19:54,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:19:56,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:20:00,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:20:03,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:20:03,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:20:04,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.76 vs. limit=15.0 2023-09-29 15:20:07,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 15:20:08,457 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.958e+02 2.224e+02 2.647e+02 3.379e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 15:20:08,500 INFO [train.py:1039] (0/4) Epoch 12, batch 2500, loss[loss=0.2057, simple_loss=0.2655, pruned_loss=0.07292, over 23800.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2673, pruned_loss=0.06233, over 4707399.65 frames. ], batch size: 212, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:20:08,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:20:15,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:18,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=406220.0, ans=0.125 2023-09-29 15:20:25,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:20:25,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 15:20:25,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=406286.6666666667, ans=0.125 2023-09-29 15:20:26,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:26,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 15:20:31,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=406286.6666666667, ans=0.0 2023-09-29 15:20:33,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:20:34,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:20:36,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:20:36,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:20:36,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 15:20:39,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:39,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:39,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. 
Duration: 0.83 2023-09-29 15:20:39,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:40,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 15:20:41,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:20:47,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:47,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:51,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:20:51,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 15:20:53,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:20:54,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:58,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:03,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:07,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 15:21:09,184 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. limit=10.0 2023-09-29 15:21:13,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:21:15,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 15:21:16,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:16,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:19,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:21:19,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:21:21,166 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 15:21:21,167 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 15:21:21,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 15:21:24,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:26,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. 
Duration: 0.97 2023-09-29 15:21:26,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 15:21:28,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=406486.6666666667, ans=0.125 2023-09-29 15:21:29,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:29,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 15:21:31,828 INFO [train.py:1039] (0/4) Epoch 12, batch 2550, loss[loss=0.1798, simple_loss=0.2557, pruned_loss=0.05197, over 24570.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2673, pruned_loss=0.06216, over 4714864.72 frames. ], batch size: 60, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:21:33,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 15:21:35,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=406553.3333333333, ans=0.125 2023-09-29 15:21:37,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:38,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:21:38,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:21:41,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:21:43,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 15:21:43,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:21:46,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 15:21:47,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=406620.0, ans=0.125 2023-09-29 15:21:48,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:21:51,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:52,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:52,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 15:21:52,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:21:52,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:21:54,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:57,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 15:21:57,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 15:21:57,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:57,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:57,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 15:22:14,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:22:17,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=406686.6666666667, ans=0.125 2023-09-29 15:22:18,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:18,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:18,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:22:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:22:26,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:22:30,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 15:22:30,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:22:31,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:22:31,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:22:31,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:22:31,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=406753.3333333333, ans=0.125 2023-09-29 15:22:37,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:37,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:41,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:22:41,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 15:22:41,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:22:43,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 15:22:44,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:22:46,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:22:46,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:52,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:22:54,378 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.975e+02 2.307e+02 2.705e+02 3.697e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-29 15:22:54,422 INFO [train.py:1039] (0/4) Epoch 12, batch 2600, loss[loss=0.2095, simple_loss=0.273, pruned_loss=0.07295, over 23612.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2683, pruned_loss=0.06316, over 4719978.02 frames. ], batch size: 256, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:22:54,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:54,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=406886.6666666667, ans=0.1 2023-09-29 15:22:57,875 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 15:23:02,202 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 15:23:02,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 15:23:02,297 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 15:23:03,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 15:23:03,830 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 15:23:06,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:23:06,871 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 15:23:09,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 15:23:11,276 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 15:23:12,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:23:14,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 15:23:17,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 15:23:19,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:23:19,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 15:23:22,930 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-09-29 15:23:22,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 15:23:26,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=407020.0, ans=0.125 2023-09-29 15:23:30,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:30,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:30,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:30,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 15:23:33,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:23:39,939 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 15:23:45,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:46,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:46,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=407086.6666666667, ans=0.0 2023-09-29 15:23:47,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. 
Duration: 0.88 2023-09-29 15:23:49,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:23:49,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:50,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 15:23:51,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=407086.6666666667, ans=0.0 2023-09-29 15:23:53,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:23:53,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:23:58,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:01,971 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.29 vs. limit=6.0 2023-09-29 15:24:02,628 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 15:24:02,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:24:07,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 15:24:08,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:24:08,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 15:24:10,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:24:11,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:11,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:14,694 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.75 vs. limit=10.0 2023-09-29 15:24:16,573 INFO [train.py:1039] (0/4) Epoch 12, batch 2650, loss[loss=0.212, simple_loss=0.2923, pruned_loss=0.0658, over 24348.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2689, pruned_loss=0.06322, over 4722863.70 frames. ], batch size: 77, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:24:16,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 15:24:18,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:18,986 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.82 vs. limit=22.5 2023-09-29 15:24:21,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 15:24:27,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 15:24:27,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:28,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:24:28,646 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 15:24:28,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:24:30,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:32,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:24:34,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:37,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:37,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 15:24:38,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:24:38,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 15:24:41,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 15:24:41,868 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=407286.6666666667, ans=10.0 2023-09-29 15:24:43,142 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 15:24:44,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:47,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 15:24:47,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:24:49,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 15:24:55,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:55,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:24:55,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:57,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:24:58,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=407353.3333333333, ans=0.1 2023-09-29 15:25:01,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 15:25:01,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 15:25:06,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:08,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 15:25:08,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:10,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:10,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=407420.0, ans=0.0 2023-09-29 15:25:11,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:11,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:11,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:13,760 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.87 vs. 
limit=12.0 2023-09-29 15:25:14,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:16,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:16,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:25:16,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:25:16,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=407420.0, ans=0.125 2023-09-29 15:25:17,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:25:19,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:20,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:25:21,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:23,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:23,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:25:25,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:25:27,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:25:27,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:27,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 15:25:29,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=407486.6666666667, ans=0.125 2023-09-29 15:25:31,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=407486.6666666667, ans=0.1 2023-09-29 15:25:33,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:34,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:34,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:37,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:38,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 15:25:39,914 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.894e+02 2.126e+02 2.422e+02 3.822e+02, threshold=4.252e+02, percent-clipped=0.0 2023-09-29 15:25:39,969 INFO [train.py:1039] (0/4) Epoch 12, batch 2700, loss[loss=0.1992, simple_loss=0.2766, pruned_loss=0.06087, over 23451.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2701, pruned_loss=0.06393, over 4715969.61 frames. ], batch size: 93, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:25:40,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:41,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:25:42,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 15:25:46,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:25:47,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:25:49,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:49,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:49,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:50,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 15:25:50,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:50,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:25:50,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:25:52,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 15:25:52,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:25:52,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=407553.3333333333, ans=0.2 2023-09-29 15:25:52,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=407553.3333333333, ans=0.2 2023-09-29 15:25:53,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:54,444 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.59 vs. limit=22.5 2023-09-29 15:25:55,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:25:56,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:26:00,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 15:26:00,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 15:26:02,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:02,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=407620.0, ans=0.1 2023-09-29 15:26:08,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:26:08,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:09,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=407620.0, ans=0.125 2023-09-29 15:26:14,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:26:14,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:26:14,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:26:14,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:26:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:19,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:26:19,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:26:19,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:26:25,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:25,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:26:35,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:26:35,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:26:40,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:26:40,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:44,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:44,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:46,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:26:48,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:26:48,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=407820.0, ans=0.0 2023-09-29 15:26:48,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=407820.0, ans=0.125 2023-09-29 15:26:49,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:51,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:26:52,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:54,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:54,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:56,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 15:26:57,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:59,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:26:59,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. 
Duration: 0.97 2023-09-29 15:27:00,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 15:27:00,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:02,261 INFO [train.py:1039] (0/4) Epoch 12, batch 2750, loss[loss=0.1948, simple_loss=0.2794, pruned_loss=0.05514, over 24595.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2709, pruned_loss=0.0639, over 4712408.72 frames. ], batch size: 71, lr: 8.64e-03, grad_scale: 16.0 2023-09-29 15:27:03,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:06,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:08,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:08,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:27:08,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:12,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:12,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=407886.6666666667, ans=0.0 2023-09-29 15:27:14,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-29 15:27:14,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:27:14,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:14,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 15:27:14,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:27:14,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:21,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 15:27:23,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:27:23,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:24,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:27:25,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=407953.3333333333, ans=0.1 2023-09-29 15:27:26,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:27:27,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:27:27,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:27:27,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:29,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:32,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:27:32,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:27:32,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:27:33,672 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.07 vs. limit=15.0 2023-09-29 15:27:34,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:34,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:27:41,486 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=408020.0, ans=0.125 2023-09-29 15:27:43,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:27:46,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:27:46,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:48,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=408020.0, ans=0.09899494936611666 2023-09-29 15:27:51,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:51,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:27:51,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:27:59,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:28:00,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:28:00,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. 
Duration: 0.75 2023-09-29 15:28:02,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=408086.6666666667, ans=0.1 2023-09-29 15:28:02,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=408086.6666666667, ans=0.125 2023-09-29 15:28:05,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:08,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 15:28:13,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:28:14,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:28:16,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 15:28:18,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:28:18,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:28:20,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 15:28:20,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 15:28:24,680 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.991e+02 2.173e+02 2.887e+02 4.719e+02, threshold=4.346e+02, percent-clipped=2.0 2023-09-29 15:28:24,722 INFO [train.py:1039] (0/4) Epoch 12, batch 2800, loss[loss=0.1891, simple_loss=0.2777, pruned_loss=0.05023, over 24558.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2698, pruned_loss=0.06327, over 4726367.48 frames. ], batch size: 71, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:28:24,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:28:24,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:24,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:28:26,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 15:28:26,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:27,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:29,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:30,025 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 15:28:30,026 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. 
Duration: 0.759 2023-09-29 15:28:33,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:36,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:28:36,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:28:39,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:28:39,567 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=408286.6666666667, ans=10.0 2023-09-29 15:28:41,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 15:28:42,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=408286.6666666667, ans=0.2 2023-09-29 15:28:44,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:28:44,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 15:28:45,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:45,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:28:45,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:28:49,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:28:49,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:49,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:28:51,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:28:59,635 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.08 vs. limit=22.5 2023-09-29 15:29:00,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:29:02,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:05,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:05,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:29:06,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:13,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 15:29:13,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 15:29:14,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:14,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:29:15,634 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.69 vs. limit=15.0 2023-09-29 15:29:19,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:19,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:24,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:27,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:29:27,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:27,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:29:28,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 15:29:28,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:29:30,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:29:30,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 15:29:30,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:30,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=408486.6666666667, ans=0.0 2023-09-29 15:29:31,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:29:31,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:33,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 15:29:35,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:35,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:29:36,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:29:39,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-09-29 15:29:45,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:45,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:29:45,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:29:48,303 INFO [train.py:1039] (0/4) Epoch 12, batch 2850, loss[loss=0.1929, simple_loss=0.2735, pruned_loss=0.05615, over 24637.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2691, pruned_loss=0.0629, over 4736446.97 frames. ], batch size: 65, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:29:48,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:29:51,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:29:51,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:53,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:56,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:56,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:59,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 15:30:00,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 15:30:06,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 15:30:06,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:08,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 15:30:10,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:12,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 15:30:14,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 15:30:15,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:17,824 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.90 vs. 
limit=12.0 2023-09-29 15:30:20,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=408686.6666666667, ans=0.125 2023-09-29 15:30:21,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=408686.6666666667, ans=0.1 2023-09-29 15:30:27,324 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.88 vs. limit=6.0 2023-09-29 15:30:29,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:31,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:31,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:30:31,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=408686.6666666667, ans=0.07 2023-09-29 15:30:32,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:30:32,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:30:33,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 15:30:34,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=408686.6666666667, ans=0.125 2023-09-29 15:30:36,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:30:36,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 15:30:38,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:30:38,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:30:38,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:38,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:41,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:41,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:43,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:45,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 15:30:45,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=408753.3333333333, ans=0.0 2023-09-29 15:30:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:30:48,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:50,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:52,426 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.42 vs. limit=15.0 2023-09-29 15:30:53,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:30:57,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:30:59,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 15:31:00,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 15:31:02,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:31:02,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:03,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. 
Duration: 0.56 2023-09-29 15:31:03,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:31:05,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:05,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:05,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:31:05,238 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 15:31:07,270 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 15:31:07,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:07,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:10,762 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.902e+02 2.094e+02 2.569e+02 4.916e+02, threshold=4.188e+02, percent-clipped=1.0 2023-09-29 15:31:10,806 INFO [train.py:1039] (0/4) Epoch 12, batch 2900, loss[loss=0.2027, simple_loss=0.2714, pruned_loss=0.067, over 23624.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2688, pruned_loss=0.06301, over 4732952.03 frames. ], batch size: 149, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:31:12,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 15:31:12,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:12,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:31:14,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 15:31:17,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:19,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 15:31:20,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 15:31:22,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:31:22,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:31:26,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:26,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:31:31,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:31,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 15:31:34,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:31:35,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 15:31:35,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:31:37,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:40,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 15:31:40,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 15:31:44,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:44,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 15:31:44,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:31:47,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:31:49,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:52,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:31:53,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:56,228 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.60 vs. limit=15.0 2023-09-29 15:31:57,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:02,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:02,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 15:32:04,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 15:32:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:32:08,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:32:10,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 15:32:11,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:32:16,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:32:24,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 15:32:24,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:32:25,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 15:32:27,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:27,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 15:32:27,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=409153.3333333333, ans=0.125 2023-09-29 15:32:28,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:28,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:32:34,391 INFO [train.py:1039] (0/4) Epoch 12, batch 2950, loss[loss=0.1999, simple_loss=0.2673, pruned_loss=0.06629, over 23785.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2695, pruned_loss=0.0637, over 4718441.17 frames. ], batch size: 212, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:32:34,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:37,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 15:32:37,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 15:32:37,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:39,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:32:39,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=409220.0, ans=0.1 2023-09-29 15:32:40,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:32:42,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 15:32:42,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 15:32:43,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:32:43,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:50,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:32:52,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:32:55,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:55,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:32:58,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:32:58,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:33:02,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:33:05,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 15:33:09,018 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.92 vs. limit=5.0 2023-09-29 15:33:09,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 15:33:09,472 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 15:33:10,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:33:12,495 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 15:33:15,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. 
Duration: 0.93 2023-09-29 15:33:15,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:33:16,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:33:16,868 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 15:33:16,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:33:19,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 15:33:20,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:33:20,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:33:23,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:25,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:33:25,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:27,377 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 15:33:27,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:33:27,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 15:33:34,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:37,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:33:37,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=409420.0, ans=0.125 2023-09-29 15:33:38,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 15:33:38,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:33:38,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 15:33:41,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:42,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=409486.6666666667, ans=0.2 2023-09-29 15:33:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:33:45,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:33:46,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:33:46,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:33:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:33:48,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:48,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:33:50,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:33:50,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:51,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:33:53,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:53,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 15:33:54,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:33:56,524 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.935e+02 2.190e+02 2.674e+02 3.950e+02, threshold=4.379e+02, percent-clipped=0.0 2023-09-29 15:33:56,587 INFO [train.py:1039] (0/4) Epoch 12, batch 3000, loss[loss=0.1748, simple_loss=0.2541, pruned_loss=0.04777, over 24657.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2698, pruned_loss=0.06329, over 4725866.88 frames. ], batch size: 65, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:33:56,588 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 15:34:11,472 INFO [train.py:1071] (0/4) Epoch 12, validation: loss=0.2606, simple_loss=0.2686, pruned_loss=0.1263, over 1125622.00 frames. 2023-09-29 15:34:11,473 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 15:34:14,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:34:14,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:34:19,196 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 15:34:19,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 15:34:20,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:34:20,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:34:22,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. 
Duration: 0.48 2023-09-29 15:34:22,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:24,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=409553.3333333333, ans=0.0 2023-09-29 15:34:29,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:34:39,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:34:47,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 15:34:47,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:34:50,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:34:52,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:52,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:34:53,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:34:53,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 15:34:56,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-09-29 15:34:58,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:34:59,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:35:02,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:35:02,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:04,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:04,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:35:07,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:35:08,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:35:08,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:35:10,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:13,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 15:35:15,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 15:35:16,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:16,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:35:18,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=409820.0, ans=0.07 2023-09-29 15:35:20,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:22,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:23,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:35:23,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 15:35:23,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:35:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 15:35:25,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=409820.0, ans=0.2 2023-09-29 15:35:26,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:35:28,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. 
Duration: 0.67 2023-09-29 15:35:31,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:35:32,696 INFO [train.py:1039] (0/4) Epoch 12, batch 3050, loss[loss=0.1748, simple_loss=0.2522, pruned_loss=0.04873, over 24311.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2697, pruned_loss=0.0629, over 4723619.23 frames. ], batch size: 56, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:35:32,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:35:34,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 15:35:34,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 15:35:34,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:35:35,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:35:36,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:38,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:35:38,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:38,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:35:39,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 15:35:40,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=409886.6666666667, ans=0.0 2023-09-29 15:35:41,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:35:43,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:35:43,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:35:48,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:51,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 15:35:59,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 15:35:59,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 15:35:59,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:01,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:36:04,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:36:05,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:05,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:07,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:08,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:36:08,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:10,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:10,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:12,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:14,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:17,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 15:36:17,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:36:17,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:36:20,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:36:20,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:36:22,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:36:22,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:28,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:28,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:37,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:37,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:36:37,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:40,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 15:36:40,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:36:42,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:42,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 15:36:44,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:44,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:47,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 15:36:47,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=410153.3333333333, ans=0.1 2023-09-29 15:36:51,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:54,188 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.862e+02 2.008e+02 2.262e+02 2.937e+02, threshold=4.017e+02, percent-clipped=0.0 2023-09-29 15:36:54,241 INFO [train.py:1039] (0/4) Epoch 12, batch 3100, loss[loss=0.1979, simple_loss=0.2643, pruned_loss=0.06572, over 23351.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2699, pruned_loss=0.06351, over 4725410.31 frames. ], batch size: 119, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:36:54,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 15:36:56,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:36:59,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:36:59,338 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=410220.0, ans=0.125 2023-09-29 15:37:01,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 15:37:03,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 15:37:05,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 15:37:07,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:37:10,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:37:10,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:12,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:37:15,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:37:19,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=410286.6666666667, ans=0.0 2023-09-29 15:37:22,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 15:37:26,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:37:27,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:27,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:27,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:37:29,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:37:30,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:37:30,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 15:37:30,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:37:32,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:32,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. 
Duration: 0.64 2023-09-29 15:37:33,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:37:37,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=410353.3333333333, ans=0.125 2023-09-29 15:37:39,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:37:40,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 15:37:41,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 15:37:42,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:42,693 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.60 vs. limit=15.0 2023-09-29 15:37:43,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:45,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:45,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:46,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 15:37:46,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:37:46,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:37:50,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:37:50,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:37:50,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:50,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 15:37:56,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:57,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 15:37:58,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:37:59,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 15:37:59,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:59,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:38:00,162 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:38:01,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 15:38:12,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 15:38:14,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:15,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:16,982 INFO [train.py:1039] (0/4) Epoch 12, batch 3150, loss[loss=0.1986, simple_loss=0.2841, pruned_loss=0.05648, over 24281.00 frames. ], tot_loss[loss=0.1976, simple_loss=0.2688, pruned_loss=0.06324, over 4719314.01 frames. ], batch size: 74, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:38:17,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:38:17,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:38:17,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 15:38:18,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:20,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 15:38:22,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 15:38:24,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:27,342 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 15:38:29,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 15:38:31,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:38:32,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 15:38:34,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:38:34,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 15:38:34,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=410620.0, ans=0.1 2023-09-29 15:38:35,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 15:38:35,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 15:38:35,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 15:38:35,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:38:37,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:37,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=410620.0, ans=0.125 2023-09-29 15:38:40,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 15:38:41,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:41,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:44,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:46,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:38:49,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 15:38:50,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:38:53,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:38:53,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:38:53,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 15:38:58,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 15:38:58,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:39:00,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:39:00,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:39:01,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:01,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:39:02,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:39:02,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:39:04,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 15:39:06,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:39:06,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:39:07,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:39:07,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:39:07,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 15:39:07,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:09,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 15:39:10,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:11,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 15:39:12,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 15:39:12,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:39:14,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:14,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 15:39:16,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 15:39:17,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:19,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:39:21,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:21,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:39:27,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:39:27,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:31,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 15:39:37,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:39:37,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:39:41,206 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.037e+02 2.402e+02 2.745e+02 4.943e+02, threshold=4.804e+02, percent-clipped=1.0 2023-09-29 15:39:41,248 INFO [train.py:1039] (0/4) Epoch 12, batch 3200, loss[loss=0.2021, simple_loss=0.2876, pruned_loss=0.05828, over 24360.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.267, pruned_loss=0.0627, over 4713672.11 frames. 
], batch size: 74, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:39:42,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:43,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:39:43,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 15:39:43,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=410886.6666666667, ans=0.2 2023-09-29 15:39:45,383 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.44 vs. limit=15.0 2023-09-29 15:39:46,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:49,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:39:52,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:40:00,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=410953.3333333333, ans=0.125 2023-09-29 15:40:03,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:40:14,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. 
Duration: 0.88 2023-09-29 15:40:15,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:40:18,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 15:40:20,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:40:23,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:40:23,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:40:25,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:40:29,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=411086.6666666667, ans=22.5 2023-09-29 15:40:29,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 15:40:31,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:40:33,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 15:40:35,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. 
Duration: 0.91 2023-09-29 15:40:37,421 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:40:38,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:40:43,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:40:45,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,384 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 15:40:45,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:40:49,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:40:50,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 15:40:50,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=411153.3333333333, ans=0.125 2023-09-29 15:40:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 15:40:54,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. 
Duration: 0.55 2023-09-29 15:40:54,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 15:40:55,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:41:00,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:41:00,114 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 15:41:00,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:00,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:01,666 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 15:41:03,202 INFO [train.py:1039] (0/4) Epoch 12, batch 3250, loss[loss=0.2132, simple_loss=0.2762, pruned_loss=0.07517, over 23532.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2684, pruned_loss=0.06271, over 4732710.13 frames. ], batch size: 256, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:41:06,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:41:10,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:41:15,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=411220.0, ans=0.0 2023-09-29 15:41:19,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:41:19,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 15:41:20,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:20,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:41:20,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:23,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:23,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:41:25,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:26,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:41:26,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:41:26,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:26,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:28,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:41:30,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:32,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:34,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:35,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:35,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=411353.3333333333, ans=0.0 2023-09-29 15:41:35,746 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.12 vs. limit=15.0 2023-09-29 15:41:37,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:37,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:37,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 15:41:41,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=411353.3333333333, ans=0.125 2023-09-29 15:41:43,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 15:41:44,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:45,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:41:46,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:47,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:41:52,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:42:00,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:00,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:00,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 15:42:00,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:42:00,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-29 15:42:00,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:01,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=411420.0, ans=0.2 2023-09-29 15:42:03,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=411420.0, ans=0.0 2023-09-29 15:42:04,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 15:42:04,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 15:42:06,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:42:07,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:07,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:09,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:42:09,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:09,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=411486.6666666667, ans=0.0 2023-09-29 15:42:14,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 15:42:14,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:17,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 15:42:17,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:20,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:42:20,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 15:42:24,240 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.64 vs. limit=15.0 2023-09-29 15:42:25,038 INFO [train.py:1039] (0/4) Epoch 12, batch 3300, loss[loss=0.2097, simple_loss=0.2736, pruned_loss=0.07284, over 22633.00 frames. ], tot_loss[loss=0.1978, simple_loss=0.2693, pruned_loss=0.06318, over 4733082.69 frames. ], batch size: 322, lr: 8.61e-03, grad_scale: 16.0 2023-09-29 15:42:25,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:25,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. 
Duration: 0.76 2023-09-29 15:42:25,485 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=411553.3333333333, ans=0.125 2023-09-29 15:42:26,579 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.957e+02 2.272e+02 2.906e+02 4.656e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 15:42:26,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 15:42:26,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=411553.3333333333, ans=0.125 2023-09-29 15:42:28,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 15:42:28,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:33,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:34,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:42:36,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:37,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:42:37,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 15:42:40,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:42,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:46,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 15:42:46,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:42:46,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:48,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:48,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 15:42:50,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:42:50,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:42:51,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:42:51,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:42:53,183 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. 
Duration: 0.896 2023-09-29 15:42:55,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:55,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:42:58,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:58,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 15:43:01,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 15:43:01,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:02,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:43:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 15:43:06,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 15:43:08,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:43:11,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 15:43:13,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 15:43:15,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:43:16,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:20,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:20,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:20,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:43:20,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:43:20,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=411753.3333333333, ans=0.125 2023-09-29 15:43:21,775 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=411753.3333333333, ans=0.0 2023-09-29 15:43:24,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:43:24,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:26,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:43:27,577 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. 
Duration: 0.896 2023-09-29 15:43:29,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 15:43:30,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:43:30,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:43:30,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:34,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:34,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:35,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:43:36,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:36,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:43:37,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:39,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:43:42,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. 
Duration: 0.77 2023-09-29 15:43:42,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:44,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:47,607 INFO [train.py:1039] (0/4) Epoch 12, batch 3350, loss[loss=0.2053, simple_loss=0.2714, pruned_loss=0.06958, over 23639.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2709, pruned_loss=0.06394, over 4729822.96 frames. ], batch size: 149, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:43:47,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:43:47,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:49,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:49,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=411886.6666666667, ans=0.0 2023-09-29 15:43:51,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:51,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:52,737 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:43:54,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 15:43:54,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=411886.6666666667, ans=0.125 2023-09-29 15:43:56,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:56,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=411886.6666666667, ans=0.2 2023-09-29 15:43:57,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:43:57,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=411886.6666666667, ans=0.125 2023-09-29 15:44:00,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:00,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=411886.6666666667, ans=0.125 2023-09-29 15:44:02,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:44:03,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:05,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:44:05,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 15:44:05,410 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. 
Duration: 0.896 2023-09-29 15:44:06,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:08,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 15:44:08,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 15:44:12,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:44:12,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:44:12,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:12,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 15:44:14,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:15,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:44:16,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:18,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:18,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:44:20,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:44:23,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:28,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:28,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:32,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:44:33,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:35,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:36,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:37,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:40,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 15:44:40,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:44:42,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. 
Duration: 0.87 2023-09-29 15:44:42,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:44:44,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 15:44:45,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:46,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:51,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=412086.6666666667, ans=0.125 2023-09-29 15:44:53,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:53,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=412153.3333333333, ans=0.0 2023-09-29 15:44:55,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 15:44:55,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:44:55,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:44:56,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:45:02,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 15:45:05,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 15:45:05,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:45:05,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:45:08,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:08,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 15:45:09,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:45:09,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 15:45:10,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=412220.0, ans=0.125 2023-09-29 15:45:11,200 INFO [train.py:1039] (0/4) Epoch 12, batch 3400, loss[loss=0.1958, simple_loss=0.2646, pruned_loss=0.06351, over 23391.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.2718, pruned_loss=0.06458, over 4721030.84 frames. ], batch size: 119, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:45:11,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:11,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 15:45:11,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:45:13,409 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.868e+02 2.094e+02 2.439e+02 4.049e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 15:45:13,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:45:15,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 15:45:18,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 15:45:18,166 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 15:45:18,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:45:20,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=412220.0, ans=0.125 2023-09-29 15:45:21,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:21,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:45:23,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:25,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 15:45:28,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=412286.6666666667, ans=0.0 2023-09-29 15:45:33,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:45:36,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 15:45:42,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:45:44,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:44,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:46,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:45:50,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:45:54,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 15:45:55,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=412353.3333333333, ans=0.2 2023-09-29 15:45:56,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=412353.3333333333, ans=0.1 2023-09-29 15:46:00,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:46:01,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:01,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=412420.0, ans=0.0 2023-09-29 15:46:03,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 15:46:04,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:04,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:06,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:46:06,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:46:09,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:46:15,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:46:16,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:46:18,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=412486.6666666667, ans=0.0 2023-09-29 15:46:20,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:46:22,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 15:46:29,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:46:33,843 INFO [train.py:1039] (0/4) Epoch 12, batch 3450, loss[loss=0.1636, simple_loss=0.2379, pruned_loss=0.04464, over 24568.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2711, pruned_loss=0.06417, over 4728487.44 frames. ], batch size: 60, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:46:33,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 15:46:37,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 15:46:37,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:39,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:46:41,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 15:46:42,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:44,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 15:46:50,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=412620.0, ans=0.125 2023-09-29 15:46:51,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:46:52,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:46:52,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:46:52,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:53,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=412620.0, ans=0.1 2023-09-29 15:46:54,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:57,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=412620.0, ans=0.0 2023-09-29 15:46:57,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=412620.0, ans=0.09899494936611666 2023-09-29 15:47:00,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 15:47:07,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 15:47:07,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 15:47:07,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:47:10,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:16,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 15:47:16,948 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.10 vs. limit=15.0 2023-09-29 15:47:17,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:47:21,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:21,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:47:22,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:47:23,323 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:47:24,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:47:27,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 15:47:27,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:47:28,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:47:31,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:47:34,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 15:47:38,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:47:43,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:47:46,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:49,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:47:53,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:53,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:54,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=412820.0, ans=0.1 2023-09-29 15:47:55,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 15:47:57,183 INFO [train.py:1039] (0/4) Epoch 12, batch 3500, loss[loss=0.1793, simple_loss=0.259, pruned_loss=0.04975, over 24694.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2689, pruned_loss=0.06361, over 4715011.38 frames. ], batch size: 65, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:47:57,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:57,982 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.16 vs. limit=10.0 2023-09-29 15:47:58,235 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.33 vs. limit=15.0 2023-09-29 15:47:58,594 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.065e+02 2.305e+02 4.202e+02, threshold=4.129e+02, percent-clipped=1.0 2023-09-29 15:48:01,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:02,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=412886.6666666667, ans=0.125 2023-09-29 15:48:04,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:48:05,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 15:48:07,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 15:48:11,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 15:48:13,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:13,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 15:48:16,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:48:18,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:48:20,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:48:20,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:20,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:48:20,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:22,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:22,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 15:48:27,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 15:48:27,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:48:27,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:32,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:34,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 15:48:34,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:37,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:37,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:48:37,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=413020.0, ans=0.125 2023-09-29 15:48:38,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:41,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:48:41,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:42,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. 
Duration: 0.75 2023-09-29 15:48:44,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 15:48:45,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 15:48:45,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:47,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:47,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=413086.6666666667, ans=0.2 2023-09-29 15:48:48,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:48,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:48:51,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:48:53,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:48:57,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:48:59,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 15:48:59,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. 
Duration: 0.88 2023-09-29 15:48:59,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:02,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:04,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:05,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:07,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 15:49:09,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:10,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:49:10,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 15:49:14,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 15:49:17,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:18,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:18,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:49:18,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:20,010 INFO [train.py:1039] (0/4) Epoch 12, batch 3550, loss[loss=0.1915, simple_loss=0.2637, pruned_loss=0.05971, over 23283.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2671, pruned_loss=0.06353, over 4709417.31 frames. ], batch size: 105, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:49:21,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:49:33,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:33,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:49:39,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:49:40,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:49:42,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:43,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:49:43,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:49:47,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 15:49:47,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:49:48,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:48,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:49:50,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:49:55,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:49:55,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:56,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:49:56,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:58,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:49:58,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 15:49:58,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:01,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:50:03,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:50:10,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:10,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:50:11,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:13,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 15:50:14,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:50:14,885 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.63 vs. limit=15.0 2023-09-29 15:50:15,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 15:50:17,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:50:18,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:50:18,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:50:21,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. 
Duration: 0.56 2023-09-29 15:50:23,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 15:50:30,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:35,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:35,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 15:50:43,734 INFO [train.py:1039] (0/4) Epoch 12, batch 3600, loss[loss=0.2007, simple_loss=0.2661, pruned_loss=0.06766, over 23688.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2671, pruned_loss=0.06291, over 4723633.97 frames. ], batch size: 135, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:50:43,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 15:50:43,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:50:44,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 15:50:45,967 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.995e+02 2.200e+02 2.637e+02 4.261e+02, threshold=4.399e+02, percent-clipped=1.0 2023-09-29 15:50:47,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:47,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:49,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:50:53,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:50:55,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:57,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:50:57,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:50:58,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:58,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 15:51:02,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 15:51:02,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:05,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:08,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:11,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:51:11,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:51:12,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 15:51:12,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:15,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:17,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:51:18,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:22,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:22,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 15:51:22,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=413686.6666666667, ans=0.125 2023-09-29 15:51:24,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 15:51:31,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:51:33,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:51:33,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 15:51:39,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:51:45,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:48,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:56,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:51:56,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:51:56,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 15:51:58,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-09-29 15:51:59,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 15:52:03,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:52:03,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:52:04,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 15:52:05,964 INFO [train.py:1039] (0/4) Epoch 12, batch 3650, loss[loss=0.2015, simple_loss=0.2582, pruned_loss=0.07244, over 23682.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2681, pruned_loss=0.06306, over 4734097.52 frames. ], batch size: 164, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:52:06,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:06,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:52:06,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:07,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 15:52:07,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 15:52:10,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 15:52:13,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 15:52:13,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=413886.6666666667, ans=0.0 2023-09-29 15:52:17,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 15:52:19,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:52:22,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 15:52:24,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 15:52:29,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:52:29,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:52:29,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:52:32,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:52:34,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:34,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-09-29 15:52:34,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:52:36,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:36,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 15:52:37,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:52:39,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:52:39,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:42,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:52:43,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 15:52:45,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 15:52:46,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:52:49,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 15:52:50,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:52:50,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:52:57,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:52:59,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:59,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:53:00,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:53:02,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:53:04,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:53:07,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:09,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:09,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:53:12,032 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.69 vs. 
limit=15.0 2023-09-29 15:53:12,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:53:12,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=414153.3333333333, ans=0.09899494936611666 2023-09-29 15:53:14,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:53:14,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:20,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 15:53:23,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:23,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:27,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:53:27,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:28,589 INFO [train.py:1039] (0/4) Epoch 12, batch 3700, loss[loss=0.198, simple_loss=0.2658, pruned_loss=0.06504, over 23669.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2697, pruned_loss=0.06441, over 4720794.26 frames. ], batch size: 232, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:53:28,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 15:53:28,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:30,802 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.903e+02 2.176e+02 2.360e+02 3.995e+02, threshold=4.353e+02, percent-clipped=0.0 2023-09-29 15:53:31,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 15:53:31,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:32,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:53:36,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:36,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:53:38,643 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.27 vs. limit=15.0 2023-09-29 15:53:39,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:39,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 15:53:39,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:53:39,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:53:41,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:53:45,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:53:48,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:49,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:49,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:53:51,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:51,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:53:51,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=414286.6666666667, ans=0.125 2023-09-29 15:53:52,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:54,515 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 15:53:57,240 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.20 vs. 
limit=15.0 2023-09-29 15:54:04,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:54:05,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:54:07,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:54:07,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 15:54:07,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:11,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:11,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 15:54:14,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:16,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:54:17,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:18,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:54:20,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 15:54:24,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:26,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 15:54:26,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:54:27,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 15:54:31,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:54:31,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:54:35,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:36,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 15:54:38,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:54:38,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:54:39,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:39,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 15:54:43,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:44,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 15:54:46,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 15:54:46,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:54:46,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:54:48,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:54:48,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:54:53,606 INFO [train.py:1039] (0/4) Epoch 12, batch 3750, loss[loss=0.2079, simple_loss=0.2773, pruned_loss=0.06922, over 23184.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2717, pruned_loss=0.0654, over 4709308.37 frames. ], batch size: 105, lr: 8.57e-03, grad_scale: 32.0 2023-09-29 15:54:53,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:55,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:54:56,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 15:54:58,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 15:54:58,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 15:55:01,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:55:03,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 15:55:03,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:55:03,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=414553.3333333333, ans=0.05 2023-09-29 15:55:04,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:04,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:06,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:10,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=414620.0, ans=15.0 2023-09-29 15:55:11,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 15:55:14,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:55:16,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:55:21,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:55:21,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=414620.0, ans=0.0 2023-09-29 15:55:22,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:24,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 15:55:24,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:26,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:26,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:28,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 15:55:34,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 15:55:34,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 15:55:34,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:37,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:37,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=414686.6666666667, ans=0.0 2023-09-29 15:55:39,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=414686.6666666667, ans=0.125 2023-09-29 15:55:42,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:44,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:55:48,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=414753.3333333333, ans=0.2 2023-09-29 15:55:49,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 15:55:52,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:56,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:56,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:55:59,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=414820.0, ans=0.07 2023-09-29 15:56:01,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:56:03,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:56:04,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:56:06,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:56:07,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:56:09,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:56:09,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=414820.0, ans=0.2 2023-09-29 15:56:15,898 INFO [train.py:1039] (0/4) Epoch 12, batch 3800, loss[loss=0.2324, simple_loss=0.2876, pruned_loss=0.08861, over 23934.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2716, pruned_loss=0.06508, over 4707193.87 frames. ], batch size: 195, lr: 8.57e-03, grad_scale: 8.0 2023-09-29 15:56:19,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:56:19,980 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.79 vs. 
limit=15.0 2023-09-29 15:56:21,128 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.017e+02 2.225e+02 2.467e+02 3.965e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 15:56:22,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=414886.6666666667, ans=0.95 2023-09-29 15:56:23,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=414886.6666666667, ans=0.125 2023-09-29 15:56:24,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:24,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:56:25,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 15:56:27,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:29,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:31,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:56:33,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:56:33,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 15:56:34,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:56:35,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=414953.3333333333, ans=0.125 2023-09-29 15:56:36,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:36,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:56:37,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:39,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 15:56:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:56:44,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:56:44,291 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=414953.3333333333, ans=0.1 2023-09-29 15:56:46,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:49,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 15:56:49,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:56:51,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:56:52,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:54,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:57,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:57:02,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:57:02,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 15:57:03,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:07,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=415086.6666666667, ans=0.1 2023-09-29 15:57:11,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:15,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:57:19,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. 
Duration: 0.75 2023-09-29 15:57:21,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 15:57:21,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:24,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:24,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:26,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 15:57:31,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 15:57:31,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 15:57:31,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:31,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=415153.3333333333, ans=0.125 2023-09-29 15:57:32,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:38,727 INFO [train.py:1039] (0/4) Epoch 12, batch 3850, loss[loss=0.2055, simple_loss=0.287, pruned_loss=0.06197, over 24642.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2707, pruned_loss=0.0643, over 4716628.23 frames. 
], batch size: 73, lr: 8.57e-03, grad_scale: 4.0 2023-09-29 15:57:38,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:57:40,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:57:45,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:57:47,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 15:57:47,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:57:47,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:53,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:57:55,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:56,433 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=415286.6666666667, ans=0.125 2023-09-29 15:57:57,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:57:57,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-09-29 15:57:57,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=415286.6666666667, ans=0.0 2023-09-29 15:58:05,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:07,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:58:10,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:10,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:58:12,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=415353.3333333333, ans=0.0 2023-09-29 15:58:13,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:13,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:58:15,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:15,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:58:16,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:17,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:58:19,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:19,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:58:20,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 15:58:22,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 15:58:22,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:22,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:25,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:27,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:27,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 15:58:30,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 15:58:32,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:34,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. 
Duration: 0.95 2023-09-29 15:58:36,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:58:42,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:43,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:46,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:48,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 15:58:50,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 15:58:53,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:55,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:57,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:58:57,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:58:58,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:00,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 15:59:00,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:59:00,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 15:59:00,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=415553.3333333333, ans=0.0 2023-09-29 15:59:01,986 INFO [train.py:1039] (0/4) Epoch 12, batch 3900, loss[loss=0.211, simple_loss=0.2689, pruned_loss=0.07658, over 23817.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2695, pruned_loss=0.06366, over 4726970.99 frames. ], batch size: 179, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 15:59:02,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:59:03,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 15:59:03,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:03,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:05,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:59:05,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:07,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 15:59:07,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:07,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:59:07,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:07,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 15:59:08,536 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.940e+02 2.154e+02 2.415e+02 3.457e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 15:59:08,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:11,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=415553.3333333333, ans=0.0 2023-09-29 15:59:12,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:15,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:15,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:59:16,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 15:59:17,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=415620.0, ans=0.2 2023-09-29 15:59:18,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:18,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:21,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:59:21,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 15:59:21,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:25,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 15:59:25,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:27,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 15:59:29,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=415620.0, ans=0.125 2023-09-29 15:59:30,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 15:59:33,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 15:59:35,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:35,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:59:35,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:59:42,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:44,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:59:45,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:59:45,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:59:47,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:59:53,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=415753.3333333333, ans=0.125 2023-09-29 15:59:54,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:54,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 16:00:02,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:00:05,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:00:13,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:15,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=415820.0, ans=0.0 2023-09-29 16:00:17,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:17,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 16:00:20,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 16:00:20,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:20,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=415820.0, ans=0.125 2023-09-29 16:00:22,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 16:00:24,268 INFO [train.py:1039] (0/4) Epoch 12, batch 3950, loss[loss=0.1975, simple_loss=0.2803, pruned_loss=0.0574, over 24035.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2685, pruned_loss=0.06313, over 4719421.26 frames. 
], batch size: 80, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 16:00:24,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:00:24,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 16:00:32,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:00:33,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 16:00:33,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:00:36,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:00:37,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:00:44,295 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 16:00:45,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:45,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 16:00:45,865 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 16:00:45,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:00:48,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:48,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:00:48,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:50,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=415953.3333333333, ans=0.1 2023-09-29 16:00:50,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=415953.3333333333, ans=0.0 2023-09-29 16:00:53,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 16:00:55,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:00:56,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:56,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:00:56,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:00:58,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:01:08,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 16:01:08,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:01:16,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 16:01:22,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 16:01:22,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 16:01:23,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:01:23,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:01:31,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:01:32,193 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=416153.3333333333, ans=0.2 2023-09-29 16:01:33,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:01:33,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:01:33,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:01:33,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-09-29 16:01:40,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:01:42,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:01:46,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 16:01:47,925 INFO [train.py:1039] (0/4) Epoch 12, batch 4000, loss[loss=0.1984, simple_loss=0.2613, pruned_loss=0.06777, over 23334.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.269, pruned_loss=0.06341, over 4725507.35 frames. ], batch size: 119, lr: 8.56e-03, grad_scale: 16.0 2023-09-29 16:01:52,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=416220.0, ans=0.0 2023-09-29 16:01:55,132 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.007e+02 2.286e+02 2.878e+02 4.961e+02, threshold=4.572e+02, percent-clipped=2.0 2023-09-29 16:01:55,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:01,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:08,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:02:08,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:02:08,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 16:02:09,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:02:10,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 16:02:10,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:02:10,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 16:02:10,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=416286.6666666667, ans=0.125 2023-09-29 16:02:11,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=416286.6666666667, ans=0.0 2023-09-29 16:02:12,401 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=15.0 2023-09-29 16:02:13,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:14,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:02:14,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:02:14,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:02:14,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=416286.6666666667, ans=0.1 2023-09-29 16:02:16,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:16,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:02:18,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:02:21,466 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 16:02:22,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:02:22,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=416353.3333333333, ans=0.125 2023-09-29 16:02:23,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:25,428 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 16:02:27,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:02:27,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:02:35,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=416353.3333333333, ans=0.0 2023-09-29 16:02:35,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=416353.3333333333, ans=0.0 2023-09-29 16:02:37,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 16:02:37,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:38,976 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=416420.0, ans=0.125 2023-09-29 16:02:40,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:02:41,869 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 16:02:43,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:02:43,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 16:02:43,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:02:43,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=416420.0, ans=0.09899494936611666 2023-09-29 16:02:44,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 16:02:45,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:02:46,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:02:46,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:02:46,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:48,841 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.63 vs. limit=22.5 2023-09-29 16:02:49,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 16:02:49,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:50,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=416420.0, ans=0.1 2023-09-29 16:02:51,415 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 16:02:58,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:03:01,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 16:03:04,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 16:03:04,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:04,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:03:05,054 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=416486.6666666667, ans=0.0 2023-09-29 16:03:08,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:11,718 INFO [train.py:1039] (0/4) Epoch 12, batch 4050, loss[loss=0.1961, simple_loss=0.2762, pruned_loss=0.05795, over 24560.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2696, pruned_loss=0.06325, over 4729117.18 frames. ], batch size: 71, lr: 8.55e-03, grad_scale: 16.0 2023-09-29 16:03:11,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:14,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:03:14,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 16:03:16,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:03:16,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:18,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 16:03:19,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:21,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:24,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=416553.3333333333, ans=0.07 2023-09-29 16:03:25,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:27,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:03:27,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=416620.0, ans=0.125 2023-09-29 16:03:29,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 16:03:30,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:03:31,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:03:36,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:38,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 16:03:40,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=416620.0, ans=0.125 2023-09-29 16:03:41,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:03:42,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 16:03:44,874 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 16:03:47,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:03:52,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 16:03:53,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:03:53,334 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=416686.6666666667, ans=0.125 2023-09-29 16:03:56,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:59,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:04:00,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=416753.3333333333, ans=15.0 2023-09-29 16:04:01,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:04:01,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:04:03,690 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.69 vs. limit=12.0 2023-09-29 16:04:04,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:04:07,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 16:04:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:04:09,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:10,071 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=416753.3333333333, ans=0.0 2023-09-29 16:04:11,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 16:04:16,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:20,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=416820.0, ans=0.125 2023-09-29 16:04:23,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 16:04:24,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:04:24,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:04:26,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 16:04:26,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 16:04:26,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:29,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:04:30,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:30,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:04:34,243 INFO [train.py:1039] (0/4) Epoch 12, batch 4100, loss[loss=0.1682, simple_loss=0.2444, pruned_loss=0.04596, over 21112.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2702, pruned_loss=0.06331, over 4721401.88 frames. ], batch size: 46, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:04:37,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 16:04:40,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 16:04:40,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. 
Duration: 0.94 2023-09-29 16:04:42,522 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.020e+02 2.338e+02 2.754e+02 3.996e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 16:04:42,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 16:04:42,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:44,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:44,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:04:47,227 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 16:04:48,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:50,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:04:51,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:52,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:04:55,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 16:04:57,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:57,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:04:57,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 16:04:58,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:58,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:04:58,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:04:59,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:59,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 16:05:04,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:05,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 16:05:07,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:05:08,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 16:05:08,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 16:05:10,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:05:10,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:05:11,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:05:13,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 16:05:15,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:05:16,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:05:18,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 16:05:20,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:05:20,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:23,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:05:27,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=417086.6666666667, ans=0.2 2023-09-29 16:05:28,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=417086.6666666667, ans=0.0 2023-09-29 16:05:29,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:32,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:35,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:05:36,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=417086.6666666667, ans=0.0 2023-09-29 16:05:42,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:05:42,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:48,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:50,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:05:52,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:54,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 16:05:55,458 INFO [train.py:1039] (0/4) Epoch 12, batch 4150, loss[loss=0.2217, simple_loss=0.2773, pruned_loss=0.08305, over 23816.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2697, pruned_loss=0.0629, over 4735521.22 frames. ], batch size: 164, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:05:55,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:05:55,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:05:57,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 16:05:59,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:59,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 16:06:00,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 16:06:01,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 16:06:02,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:06:03,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=417220.0, ans=0.125 2023-09-29 16:06:06,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 16:06:06,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:11,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:12,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:14,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:06:16,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:06:17,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:06:17,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:06:22,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:23,718 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.72 vs. limit=10.0 2023-09-29 16:06:25,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=417286.6666666667, ans=0.0 2023-09-29 16:06:27,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 16:06:29,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 16:06:29,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 16:06:29,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:06:31,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 16:06:31,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:06:31,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:35,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:36,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:41,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 16:06:46,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:06:46,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:06:47,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. 
Duration: 0.76 2023-09-29 16:06:48,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:49,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=417420.0, ans=0.0 2023-09-29 16:06:50,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 16:06:52,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:06:55,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:55,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:57,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 16:06:57,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:57,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:07:00,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:07:03,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 16:07:03,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 16:07:03,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:07:05,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:07:06,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 16:07:06,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:07:06,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 16:07:08,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:07:10,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:10,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 16:07:10,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:07:16,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:07:18,214 INFO [train.py:1039] (0/4) Epoch 12, batch 4200, loss[loss=0.1836, simple_loss=0.2578, pruned_loss=0.05474, over 24504.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2678, pruned_loss=0.06205, over 4722491.59 frames. 
], batch size: 58, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:07:18,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 16:07:20,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:07:22,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:25,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:07:26,353 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.937e+02 2.271e+02 2.682e+02 4.339e+02, threshold=4.541e+02, percent-clipped=0.0 2023-09-29 16:07:26,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:26,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 16:07:29,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=417553.3333333333, ans=0.125 2023-09-29 16:07:32,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 16:07:32,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 16:07:35,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:37,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:07:39,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:07:41,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:07:42,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:42,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 16:07:42,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:45,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:45,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:45,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:07:49,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:07:49,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. 
Duration: 0.78 2023-09-29 16:07:49,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:54,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:07:56,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:07:59,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:08:00,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:02,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:08:02,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 16:08:02,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:05,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:08:08,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:08:10,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:17,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 16:08:18,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 16:08:20,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:27,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:08:27,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:30,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 16:08:34,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:08:39,279 INFO [train.py:1039] (0/4) Epoch 12, batch 4250, loss[loss=0.2195, simple_loss=0.2884, pruned_loss=0.07527, over 23259.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2665, pruned_loss=0.06176, over 4718500.08 frames. ], batch size: 119, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:08:39,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:39,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:08:39,722 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=417886.6666666667, ans=0.0 2023-09-29 16:08:41,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:08:48,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:08:49,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 16:08:49,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:51,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:57,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:08:59,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=417953.3333333333, ans=0.1 2023-09-29 16:09:02,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:02,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:04,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:09:04,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:05,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:09:05,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:07,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:09,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:09:10,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:12,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 16:09:16,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 16:09:18,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:18,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:09:19,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:21,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:09:21,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:21,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:09:25,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=418020.0, ans=0.0 2023-09-29 16:09:26,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:09:26,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:09:30,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=418086.6666666667, ans=0.1 2023-09-29 16:09:31,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:33,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:33,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 16:09:34,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:09:34,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 16:09:36,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:09:37,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:09:40,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:09:40,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:43,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 16:09:44,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:09:45,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:09:50,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:51,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:54,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:09:55,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:57,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:09:59,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:09:59,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:09:59,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. 
Duration: 0.94 2023-09-29 16:10:01,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:03,605 INFO [train.py:1039] (0/4) Epoch 12, batch 4300, loss[loss=0.1963, simple_loss=0.2595, pruned_loss=0.06654, over 23293.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2658, pruned_loss=0.06113, over 4728141.24 frames. ], batch size: 119, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:10:08,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:10:08,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:11,195 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.977e+02 2.264e+02 2.605e+02 3.860e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 16:10:11,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:11,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=418220.0, ans=0.0 2023-09-29 16:10:18,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:10:18,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 16:10:20,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:10:22,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 16:10:22,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:10:22,240 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 16:10:25,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:10:29,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:10:32,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 16:10:32,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:10:34,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 16:10:36,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:10:38,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:10:42,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:10:42,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:42,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 16:10:44,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:45,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:10:45,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 16:10:45,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 16:10:48,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:10:50,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:10:50,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:50,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 16:10:50,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 16:10:52,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. 
Duration: 0.95 2023-09-29 16:10:53,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:10:53,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 16:10:53,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 16:10:55,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:10:57,025 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 16:10:59,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:11:02,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:02,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:11:04,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 16:11:06,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:11:06,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:06,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:11:06,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:08,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:11:09,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:11:13,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:13,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:14,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:20,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 16:11:22,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:11:24,109 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:11:25,245 INFO [train.py:1039] (0/4) Epoch 12, batch 4350, loss[loss=0.175, simple_loss=0.2552, pruned_loss=0.04741, over 24337.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2663, pruned_loss=0.06158, over 4721026.22 frames. ], batch size: 61, lr: 8.53e-03, grad_scale: 8.0 2023-09-29 16:11:25,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:11:28,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:30,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:11:30,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:11:34,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:11:36,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=418553.3333333333, ans=0.125 2023-09-29 16:11:39,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:43,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:11:43,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=418620.0, ans=0.125 2023-09-29 16:11:44,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:46,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:11:49,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 16:11:50,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:11:56,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 16:11:57,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:58,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:58,731 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:12:04,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:07,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 16:12:11,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:12,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:12:13,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=418753.3333333333, ans=0.0 2023-09-29 16:12:18,156 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 16:12:19,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:12:19,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:12:21,236 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 16:12:21,354 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 16:12:21,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:22,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:12:22,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:12:24,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:25,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:25,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:12:25,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=418753.3333333333, ans=0.125 2023-09-29 16:12:30,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 16:12:30,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:12:30,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:30,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 16:12:31,900 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 16:12:31,907 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 16:12:31,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 16:12:35,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:12:35,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:12:35,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:12:36,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:12:38,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 16:12:41,343 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. 
Duration: 0.768 2023-09-29 16:12:41,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:47,049 INFO [train.py:1039] (0/4) Epoch 12, batch 4400, loss[loss=0.2898, simple_loss=0.3332, pruned_loss=0.1232, over 19403.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2681, pruned_loss=0.06215, over 4728663.07 frames. ], batch size: 388, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:12:47,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:47,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:50,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:50,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 16:12:52,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 16:12:52,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 16:12:53,001 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 16:12:54,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:12:54,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:12:54,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=418886.6666666667, ans=0.0 2023-09-29 16:12:55,964 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.963e+02 2.169e+02 2.661e+02 4.171e+02, threshold=4.339e+02, percent-clipped=0.0 2023-09-29 16:12:56,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 16:12:59,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:00,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:00,889 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 16:13:01,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=418886.6666666667, ans=0.2 2023-09-29 16:13:05,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:05,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 16:13:05,451 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 16:13:09,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 16:13:10,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. 
Duration: 0.99 2023-09-29 16:13:10,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 16:13:10,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:10,468 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=418953.3333333333, ans=0.1 2023-09-29 16:13:11,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:13,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:14,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:16,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 16:13:16,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 16:13:17,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:19,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:13:19,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:21,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:13:23,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:23,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 16:13:24,611 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 16:13:27,093 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.19 vs. limit=22.5 2023-09-29 16:13:27,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:35,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:36,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 16:13:40,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=419086.6666666667, ans=0.125 2023-09-29 16:13:41,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:13:41,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=419086.6666666667, ans=10.0 2023-09-29 16:13:43,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:13:47,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 16:13:47,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 16:13:47,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:13:47,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:13:47,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:13:49,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:13:53,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=419153.3333333333, ans=0.0 2023-09-29 16:13:54,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 16:13:56,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=419153.3333333333, ans=0.1 2023-09-29 16:13:57,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 16:13:59,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 16:13:59,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:59,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-09-29 16:14:01,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:14:04,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:14:06,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 16:14:09,899 INFO [train.py:1039] (0/4) Epoch 12, batch 4450, loss[loss=0.2276, simple_loss=0.2893, pruned_loss=0.08298, over 23425.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2684, pruned_loss=0.0621, over 4739842.68 frames. ], batch size: 285, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:14:12,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:14:14,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:14,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:14:23,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:24,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:14:26,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:27,401 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.58 vs. 
limit=15.0 2023-09-29 16:14:28,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:14:32,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:14:33,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:35,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 16:14:35,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:37,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:37,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:14:37,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:14:38,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:14:43,230 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.00 vs. limit=15.0 2023-09-29 16:14:45,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:14:45,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:47,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:47,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:49,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:14:53,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:14:55,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 16:14:56,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 16:14:56,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:14:58,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:58,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 16:15:02,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:15:06,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:15:08,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 16:15:08,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:08,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:10,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:15:10,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:15:10,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:12,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=419420.0, ans=0.0 2023-09-29 16:15:13,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:15:15,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 16:15:17,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:15:17,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=419486.6666666667, ans=0.0 2023-09-29 16:15:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:15:21,226 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:15:22,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:23,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:24,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:15:25,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:15:28,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 16:15:30,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:15:31,785 INFO [train.py:1039] (0/4) Epoch 12, batch 4500, loss[loss=0.267, simple_loss=0.3179, pruned_loss=0.1081, over 19938.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2689, pruned_loss=0.06241, over 4736551.43 frames. ], batch size: 388, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:15:35,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:37,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 16:15:37,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-09-29 16:15:38,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:15:40,285 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.947e+02 2.224e+02 2.499e+02 3.956e+02, threshold=4.448e+02, percent-clipped=0.0 2023-09-29 16:15:44,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:44,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:45,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:15:47,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:15:47,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:47,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:59,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=419620.0, ans=0.125 2023-09-29 16:16:00,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:16:00,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:16:03,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:04,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:16:05,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:16:13,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:16:18,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:16:18,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=419686.6666666667, ans=0.125 2023-09-29 16:16:23,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:16:27,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:16:27,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 16:16:27,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:27,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:16:30,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:33,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:16:34,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 16:16:34,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:16:34,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:39,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:16:39,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:16:43,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:45,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:16:45,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=419820.0, ans=0.125 2023-09-29 16:16:46,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 16:16:48,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 16:16:48,449 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=419820.0, ans=0.0 2023-09-29 16:16:50,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 16:16:50,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 16:16:54,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 16:16:56,292 INFO [train.py:1039] (0/4) Epoch 12, batch 4550, loss[loss=0.2117, simple_loss=0.2648, pruned_loss=0.0793, over 23331.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.268, pruned_loss=0.06259, over 4722899.98 frames. ], batch size: 285, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:16:59,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 16:16:59,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:02,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:04,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:07,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:17:11,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:17:13,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:17:15,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:15,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:17:15,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:18,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:18,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:19,487 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=10.54 vs. limit=10.0 2023-09-29 16:17:22,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:17:24,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 16:17:26,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-09-29 16:17:26,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:17:28,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 16:17:30,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=420020.0, ans=22.5 2023-09-29 16:17:33,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 16:17:34,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:36,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 16:17:37,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:17:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:17:46,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 16:17:46,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:17:46,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=420086.6666666667, ans=0.05 2023-09-29 16:17:49,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:49,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:51,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=420086.6666666667, ans=0.2 2023-09-29 16:17:52,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:53,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 16:17:55,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 16:17:55,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:17:57,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 16:17:57,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 16:17:57,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:59,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:17:59,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:01,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:01,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:18:04,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:18:04,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 16:18:05,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=420153.3333333333, ans=0.0 2023-09-29 16:18:06,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:18:06,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:18:07,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 16:18:07,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:18:07,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 16:18:10,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 16:18:10,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:18:14,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:18:14,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:14,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:18:17,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:18:18,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:18:19,362 INFO [train.py:1039] (0/4) Epoch 12, batch 4600, loss[loss=0.1867, simple_loss=0.2456, pruned_loss=0.06385, over 22783.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2676, pruned_loss=0.06248, over 4726316.08 frames. ], batch size: 322, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:18:22,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:23,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:25,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:18:25,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 16:18:26,889 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.954e+02 2.198e+02 2.471e+02 4.636e+02, threshold=4.396e+02, percent-clipped=1.0 2023-09-29 16:18:27,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:27,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 16:18:28,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:18:36,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:18:36,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:39,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:44,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=420286.6666666667, ans=0.0 2023-09-29 16:18:47,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 16:18:47,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:50,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:18:52,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:18:52,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:57,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 16:18:57,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:18:58,322 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.06 vs. limit=15.0 2023-09-29 16:18:59,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:04,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:04,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:19:06,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:19:11,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 16:19:13,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 16:19:16,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=420420.0, ans=0.125 2023-09-29 16:19:18,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:19,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:19:22,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:22,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 16:19:22,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:24,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 16:19:24,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:24,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:26,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:26,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:19:27,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:19:29,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 16:19:29,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=420486.6666666667, ans=0.125 2023-09-29 16:19:30,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 16:19:30,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 16:19:30,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:32,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:32,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:34,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:43,608 INFO [train.py:1039] (0/4) Epoch 12, batch 4650, loss[loss=0.1984, simple_loss=0.264, pruned_loss=0.06643, over 23809.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2671, pruned_loss=0.06207, over 4735322.74 frames. ], batch size: 164, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:19:45,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 16:19:46,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=420553.3333333333, ans=0.2 2023-09-29 16:19:49,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:51,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:51,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:19:51,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:52,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:54,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:57,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 16:20:01,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:20:03,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 16:20:03,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:20:03,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. 
Duration: 0.94 2023-09-29 16:20:05,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:20:05,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 16:20:05,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 16:20:05,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:07,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:20:10,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:20:11,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:11,751 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 16:20:16,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:17,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 16:20:19,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=420686.6666666667, ans=0.2 2023-09-29 16:20:20,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 16:20:22,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:20:22,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 16:20:24,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:20:28,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:20:30,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:35,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:39,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:20:44,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 16:20:44,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 16:20:44,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-29 16:20:44,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 16:20:46,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:20:55,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:20:55,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:20:55,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 16:20:55,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:58,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:20:58,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:21:00,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:21:03,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:21:03,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:21:03,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:21:04,938 INFO [train.py:1039] (0/4) Epoch 12, batch 4700, loss[loss=0.2062, simple_loss=0.2838, pruned_loss=0.06429, over 24032.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.268, pruned_loss=0.06189, over 4742006.98 frames. ], batch size: 86, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:21:09,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:09,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:21:10,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:21:11,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 16:21:11,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:21:12,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 16:21:14,008 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.872e+02 2.064e+02 2.331e+02 3.087e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-29 16:21:14,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=420886.6666666667, ans=0.1 2023-09-29 16:21:16,206 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=420886.6666666667, ans=0.0 2023-09-29 16:21:20,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:21:21,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:22,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:21:22,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:24,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:21:29,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 16:21:29,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 16:21:33,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:33,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:21:34,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:21:37,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:45,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:21:45,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. 
Duration: 0.3909375 2023-09-29 16:21:48,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:49,601 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.31 vs. limit=15.0 2023-09-29 16:21:55,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 16:21:57,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:22:00,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:04,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 16:22:04,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:22:09,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:22:09,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 16:22:12,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:12,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:14,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:22:15,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:22:15,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 16:22:17,100 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 16:22:18,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:20,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 16:22:21,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:25,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 16:22:27,029 INFO [train.py:1039] (0/4) Epoch 12, batch 4750, loss[loss=0.1719, simple_loss=0.2486, pruned_loss=0.04756, over 24326.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2684, pruned_loss=0.06254, over 4729373.25 frames. ], batch size: 61, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:22:28,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:22:30,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:22:36,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 16:22:37,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:22:41,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 16:22:42,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:22:44,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:22:49,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 16:22:49,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=421286.6666666667, ans=0.0 2023-09-29 16:22:54,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 16:22:55,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 16:22:55,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:23:02,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:03,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=421353.3333333333, ans=0.125 2023-09-29 16:23:04,428 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 16:23:04,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 16:23:11,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 16:23:14,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:14,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:16,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 16:23:16,608 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 16:23:16,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:19,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:23:23,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:23:24,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 16:23:25,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 16:23:26,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:27,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:23:27,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:28,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:23:29,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 16:23:33,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. 
Duration: 0.65 2023-09-29 16:23:36,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:23:39,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:23:40,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 16:23:40,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:23:42,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:43,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:23:45,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:45,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:23:48,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=421553.3333333333, ans=0.125 2023-09-29 16:23:50,779 INFO [train.py:1039] (0/4) Epoch 12, batch 4800, loss[loss=0.1809, simple_loss=0.2518, pruned_loss=0.05502, over 24258.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2686, pruned_loss=0.06237, over 4742197.39 frames. ], batch size: 56, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:23:50,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:23:50,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 16:23:52,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 16:23:52,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 16:23:55,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:23:55,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:57,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 16:23:59,991 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.054e+02 2.346e+02 2.832e+02 5.942e+02, threshold=4.692e+02, percent-clipped=5.0 2023-09-29 16:24:03,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:04,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:06,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=421620.0, ans=0.125 2023-09-29 16:24:07,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:24:08,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:24:09,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:09,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 16:24:09,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:24:11,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:24:11,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:24:17,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:18,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:20,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:24:20,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:21,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:24:21,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:24:21,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=421686.6666666667, ans=0.0 2023-09-29 16:24:23,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:25,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:28,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:29,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:29,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:24:31,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:24:34,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 16:24:36,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 16:24:36,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 16:24:37,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:24:37,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:37,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:24:39,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:24:39,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:42,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:24:45,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=421753.3333333333, ans=0.0 2023-09-29 16:24:46,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:48,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:24:53,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 16:24:53,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:24:53,985 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.41 vs. limit=15.0 2023-09-29 16:24:55,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:55,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:24:56,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:25:01,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:25:01,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:25:01,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:01,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:25:01,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=421820.0, ans=0.0 2023-09-29 16:25:02,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:25:02,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 16:25:07,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:08,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:08,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:25:10,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 16:25:11,820 INFO [train.py:1039] (0/4) Epoch 12, batch 4850, loss[loss=0.1912, simple_loss=0.2777, pruned_loss=0.05233, over 24423.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2693, pruned_loss=0.06274, over 4732493.59 frames. ], batch size: 69, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:25:12,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 16:25:12,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:12,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:13,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:13,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:16,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:25:25,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 16:25:27,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:33,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:33,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:25:34,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:37,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:39,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:25:39,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=421953.3333333333, ans=0.125 2023-09-29 16:25:40,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:25:42,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 16:25:45,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:48,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 16:25:48,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:25:48,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:25:48,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 16:25:50,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:50,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 16:25:55,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 16:25:56,003 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.53 vs. limit=6.0 2023-09-29 16:25:57,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:26:06,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:26:07,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-09-29 16:26:09,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:26:09,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:26:12,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:26:12,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 16:26:12,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:13,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 16:26:13,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:15,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:16,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 16:26:23,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=422153.3333333333, ans=0.125 2023-09-29 16:26:24,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:30,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 16:26:32,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:26:34,906 INFO [train.py:1039] (0/4) Epoch 12, batch 4900, loss[loss=0.1787, simple_loss=0.2361, pruned_loss=0.06064, over 23426.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2679, pruned_loss=0.06299, over 4722445.45 frames. ], batch size: 285, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:26:36,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=422220.0, ans=0.0 2023-09-29 16:26:38,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 16:26:38,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:26:43,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:43,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:44,648 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.050e+02 2.285e+02 2.620e+02 3.714e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 16:26:44,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:26:48,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 16:26:52,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-09-29 16:26:57,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 16:26:57,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=422286.6666666667, ans=0.125 2023-09-29 16:26:58,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 16:26:58,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:26:59,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:27:00,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:00,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:00,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:27:00,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 16:27:05,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 16:27:06,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:27:06,788 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=7.16 vs. 
limit=12.0 2023-09-29 16:27:08,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:27:08,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:27:12,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:27:12,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:13,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:13,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 16:27:15,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:27:16,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:18,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 16:27:18,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 16:27:21,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 16:27:22,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 16:27:24,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:27:24,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:27:25,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:25,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:27:25,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:27:25,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 16:27:28,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:31,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:27:33,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:27:34,266 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=422420.0, ans=15.0 2023-09-29 16:27:36,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 16:27:38,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 16:27:39,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:27:39,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 16:27:45,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=422486.6666666667, ans=0.0 2023-09-29 16:27:46,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:48,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:27:49,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 16:27:49,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:27:49,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:27:49,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:54,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:54,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:27:54,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:27:54,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 16:27:57,483 INFO [train.py:1039] (0/4) Epoch 12, batch 4950, loss[loss=0.1623, simple_loss=0.2123, pruned_loss=0.0562, over 19075.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.2669, pruned_loss=0.06293, over 4709472.11 frames. ], batch size: 389, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:27:57,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:27:58,492 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.13 vs. limit=10.0 2023-09-29 16:27:59,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:00,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:28:03,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 16:28:03,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 16:28:05,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:28:06,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 16:28:06,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:28:06,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:28:06,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:28:06,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:08,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:10,517 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.21 vs. limit=12.0 2023-09-29 16:28:11,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:28:12,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:28:12,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:15,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:15,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:28:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 16:28:25,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:27,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:28:30,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:30,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:32,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:28:33,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 16:28:35,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 16:28:35,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=422686.6666666667, ans=0.125 2023-09-29 16:28:38,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:41,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:28:41,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:28:43,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 16:28:43,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:28:43,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:28:45,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:46,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:28:47,041 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=422753.3333333333, ans=0.125 2023-09-29 16:28:50,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:28:53,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:53,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 16:28:54,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:28:56,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:28:59,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:29:00,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:29:00,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:29:00,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:02,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:29:02,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:29:02,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=422820.0, ans=0.1 2023-09-29 16:29:05,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:29:05,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:29:06,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:29:08,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 16:29:11,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:16,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. 
Duration: 0.78 2023-09-29 16:29:18,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:29:20,410 INFO [train.py:1039] (0/4) Epoch 12, batch 5000, loss[loss=0.2064, simple_loss=0.2656, pruned_loss=0.07361, over 22909.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2662, pruned_loss=0.06264, over 4714038.18 frames. ], batch size: 322, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:29:26,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:26,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:27,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 16:29:28,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 16:29:31,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:29:31,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=422886.6666666667, ans=0.1 2023-09-29 16:29:32,531 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.922e+02 2.238e+02 2.801e+02 3.922e+02, threshold=4.477e+02, percent-clipped=0.0 2023-09-29 16:29:32,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 16:29:32,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 16:29:32,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:29:34,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 16:29:35,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:35,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:29:37,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 16:29:37,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:37,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:29:39,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 16:29:40,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 16:29:40,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:29:40,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 16:29:40,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 16:29:42,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:42,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:29:42,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 16:29:42,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 16:29:43,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 16:29:45,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:45,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:46,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 16:29:46,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:50,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:54,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-09-29 16:29:57,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 16:29:57,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:57,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=423020.0, ans=0.1 2023-09-29 16:29:59,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:30:03,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 16:30:04,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=423020.0, ans=0.2 2023-09-29 16:30:05,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=423020.0, ans=0.0 2023-09-29 16:30:06,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:30:08,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:30:08,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:11,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 16:30:11,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:30:12,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:13,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:14,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 16:30:14,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:25,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 16:30:31,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:32,274 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.45 vs. limit=10.0 2023-09-29 16:30:40,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:41,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:30:41,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:30:41,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:41,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:30:43,040 INFO [train.py:1039] (0/4) Epoch 12, batch 5050, loss[loss=0.214, simple_loss=0.282, pruned_loss=0.07299, over 23233.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2671, pruned_loss=0.0633, over 4702490.87 frames. ], batch size: 105, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:30:43,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:30:43,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 16:30:49,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:30:51,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:52,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 16:30:54,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 16:30:54,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:57,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:30:58,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:30:58,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:31:09,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 16:31:09,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:31:11,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:11,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 16:31:12,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:14,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:31:15,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:31:17,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:31:17,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 16:31:18,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 16:31:20,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:20,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=423353.3333333333, ans=0.125 2023-09-29 16:31:21,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:23,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:24,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 16:31:26,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:29,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 16:31:32,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 16:31:32,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:31:34,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:35,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:36,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:31:39,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:31:39,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:39,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:31:41,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:31:41,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 16:31:41,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:31:43,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:46,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 16:31:46,199 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 16:31:46,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:31:47,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:31:49,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:49,305 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 16:31:52,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:52,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 16:31:52,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:56,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:56,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:58,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 16:31:58,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. 
Duration: 0.54 2023-09-29 16:32:01,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:01,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:01,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:32:03,311 INFO [train.py:1039] (0/4) Epoch 12, batch 5100, loss[loss=0.188, simple_loss=0.2638, pruned_loss=0.0561, over 24340.00 frames. ], tot_loss[loss=0.1976, simple_loss=0.2681, pruned_loss=0.06353, over 4703347.30 frames. ], batch size: 61, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:32:04,946 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 16:32:06,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:32:10,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 16:32:10,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 16:32:12,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:15,491 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.923e+02 2.120e+02 2.504e+02 4.528e+02, threshold=4.241e+02, percent-clipped=1.0 2023-09-29 16:32:15,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 16:32:18,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:32:20,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 16:32:20,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 16:32:25,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:32:25,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:32:28,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:30,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 16:32:31,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:33,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:32:33,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:32:36,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:37,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:32:37,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 16:32:39,279 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 16:32:41,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:42,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 16:32:42,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 16:32:43,460 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.14 vs. limit=15.0 2023-09-29 16:32:47,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:50,305 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=423686.6666666667, ans=0.0 2023-09-29 16:32:56,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:32:58,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 16:32:58,501 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 16:32:58,524 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. 
Duration: 0.896 2023-09-29 16:33:00,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 16:33:00,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:33:03,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 16:33:07,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 16:33:10,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:33:10,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=423820.0, ans=0.125 2023-09-29 16:33:12,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:33:16,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 16:33:18,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:33:18,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 16:33:24,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:33:24,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:33:24,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:33:26,690 INFO [train.py:1039] (0/4) Epoch 12, batch 5150, loss[loss=0.2086, simple_loss=0.2702, pruned_loss=0.07351, over 23764.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2694, pruned_loss=0.06373, over 4714878.41 frames. ], batch size: 164, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:33:26,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:33:26,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:33:28,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:33:29,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 16:33:29,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 16:33:31,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 16:33:31,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:33:31,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 16:33:32,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 16:33:32,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:33:34,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:36,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:40,113 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-09-29 16:33:41,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:33:41,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 16:33:42,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:44,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:33:45,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:33:45,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:33:45,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:33:45,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:33:45,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:33:46,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=423953.3333333333, ans=0.1 2023-09-29 16:33:47,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 16:33:49,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:33:49,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:33:53,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:33:55,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 16:33:55,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:33:58,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=424020.0, ans=0.1 2023-09-29 16:34:03,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:34:05,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-09-29 16:34:08,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:15,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:17,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:19,702 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.89 vs. limit=15.0 2023-09-29 16:34:22,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:22,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:23,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 16:34:28,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:34:28,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:34:30,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:34:33,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:34:35,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:35,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 16:34:43,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:43,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:34:45,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:45,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:34:46,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:34:46,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:34:46,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:34:47,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=424220.0, ans=0.0 2023-09-29 16:34:48,046 INFO [train.py:1039] (0/4) Epoch 12, batch 5200, loss[loss=0.1905, simple_loss=0.274, pruned_loss=0.0535, over 23479.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2692, pruned_loss=0.06377, over 4715212.52 frames. 
], batch size: 105, lr: 8.48e-03, grad_scale: 16.0 2023-09-29 16:34:48,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:34:49,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=424220.0, ans=0.2 2023-09-29 16:34:51,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:34:54,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:34:55,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:58,797 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.937e+02 2.192e+02 2.501e+02 3.290e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 16:34:59,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 16:35:00,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:35:01,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:05,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:05,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 16:35:05,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:07,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 16:35:11,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:35:11,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=424286.6666666667, ans=0.0 2023-09-29 16:35:12,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:15,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 16:35:18,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:35:18,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:35:20,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 16:35:20,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 16:35:20,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=424353.3333333333, ans=0.125 2023-09-29 16:35:23,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. 
Duration: 0.98 2023-09-29 16:35:23,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:23,633 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 16:35:23,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:25,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:25,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:35:26,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 16:35:26,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:35:29,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:32,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 16:35:32,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 16:35:34,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 16:35:38,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. 
Duration: 0.37 2023-09-29 16:35:40,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:35:45,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:35:45,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:35:48,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 16:35:48,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:48,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 16:35:48,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:50,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:35:54,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:35:56,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:35:57,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:59,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:35:59,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:05,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:06,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 16:36:07,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:36:07,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:36:08,358 INFO [train.py:1039] (0/4) Epoch 12, batch 5250, loss[loss=0.2068, simple_loss=0.2865, pruned_loss=0.06352, over 24622.00 frames. ], tot_loss[loss=0.1983, simple_loss=0.269, pruned_loss=0.06384, over 4710247.20 frames. ], batch size: 68, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:36:08,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:09,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:36:11,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:36:15,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:36:17,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:36:17,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:36:18,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:36:22,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=424553.3333333333, ans=0.125 2023-09-29 16:36:25,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:26,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:36:26,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:36:28,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:36:28,936 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.98 vs. limit=15.0 2023-09-29 16:36:31,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 16:36:31,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:32,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 16:36:55,308 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=424753.3333333333, ans=0.0 2023-09-29 16:36:56,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=424753.3333333333, ans=0.0 2023-09-29 16:36:57,244 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.95 vs. limit=15.0 2023-09-29 16:37:07,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=424753.3333333333, ans=0.2 2023-09-29 16:37:11,386 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=424820.0, ans=0.1 2023-09-29 16:37:15,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=424820.0, ans=0.05 2023-09-29 16:37:23,496 INFO [train.py:1039] (0/4) Epoch 12, batch 5300, loss[loss=0.2047, simple_loss=0.2638, pruned_loss=0.07282, over 23842.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2673, pruned_loss=0.06351, over 4706784.59 frames. ], batch size: 212, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:37:24,204 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.09 vs. limit=15.0 2023-09-29 16:37:33,065 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.874e+02 2.092e+02 2.441e+02 3.524e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 16:37:38,111 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-12.pt 2023-09-29 16:37:45,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 16:37:45,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 16:37:45,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 16:37:45,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:45,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:46,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:46,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:46,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:46,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:37:46,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:46,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:37:47,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:37:47,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. 
Duration: 0.9 2023-09-29 16:37:47,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 16:37:47,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 16:37:47,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:37:47,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 16:37:47,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 16:37:48,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:48,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:48,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:48,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:48,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:37:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:49,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:37:49,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:49,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:49,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:49,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:37:49,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:49,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:37:51,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 16:37:51,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:51,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:51,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 16:37:51,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 16:37:52,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 16:37:52,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:37:52,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 16:37:52,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 16:37:52,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:53,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:37:53,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:53,474 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 16:37:53,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 16:37:53,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:37:53,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:53,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 16:37:53,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. 
Duration: 0.96 2023-09-29 16:37:54,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 16:37:54,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:57,152 INFO [train.py:1039] (0/4) Epoch 13, batch 0, loss[loss=0.2071, simple_loss=0.2864, pruned_loss=0.0639, over 24006.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2864, pruned_loss=0.0639, over 24006.00 frames. ], batch size: 80, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:37:57,153 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 16:38:10,941 INFO [train.py:1071] (0/4) Epoch 13, validation: loss=0.2695, simple_loss=0.2756, pruned_loss=0.1317, over 1125622.00 frames. 2023-09-29 16:38:10,942 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20655MB 2023-09-29 16:38:12,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 16:38:13,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=424966.6666666667, ans=0.125 2023-09-29 16:38:13,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=424966.6666666667, ans=0.125 2023-09-29 16:38:14,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:38:15,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 16:38:17,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=424966.6666666667, ans=0.125 2023-09-29 16:38:20,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:20,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:38:22,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:22,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 16:38:24,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 16:38:24,928 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.17 vs. limit=12.0 2023-09-29 16:38:27,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:28,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:32,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=425033.3333333333, ans=0.125 2023-09-29 16:38:33,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:38:33,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:35,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:38:35,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:36,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 16:38:38,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:41,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=425033.3333333333, ans=0.2 2023-09-29 16:38:45,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:38:45,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:48,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 16:38:52,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:38:52,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 16:38:52,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=425100.0, ans=0.125 2023-09-29 16:38:53,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:38:54,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=425100.0, ans=0.1 2023-09-29 16:38:59,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:39:04,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:07,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=425166.6666666667, ans=0.125 2023-09-29 16:39:11,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 16:39:12,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=425166.6666666667, ans=0.0 2023-09-29 16:39:14,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 16:39:14,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:14,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:39:15,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:39:16,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:39:18,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 16:39:22,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:24,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:27,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:39:30,525 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 16:39:32,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:39:34,204 INFO [train.py:1039] (0/4) Epoch 13, batch 50, loss[loss=0.1571, simple_loss=0.2339, pruned_loss=0.04017, over 24259.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2675, pruned_loss=0.06127, over 1069789.72 frames. ], batch size: 56, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:39:35,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:37,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:39:37,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 16:39:37,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=425300.0, ans=0.0 2023-09-29 16:39:39,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:39:39,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:39:44,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:46,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:48,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:48,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-09-29 16:39:51,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 16:39:51,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:58,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:40:00,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. 
Duration: 0.83 2023-09-29 16:40:00,684 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:40:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 16:40:04,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:40:06,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:06,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:06,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:08,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:40:08,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=425433.3333333333, ans=0.0 2023-09-29 16:40:10,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:40:10,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:10,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=425433.3333333333, ans=0.035 2023-09-29 16:40:17,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:40:18,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:18,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:40:20,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 16:40:23,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:40:24,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:40:24,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 16:40:26,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:26,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=425500.0, ans=0.0 2023-09-29 16:40:28,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 16:40:35,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:40:35,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:38,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:40:40,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:40,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:42,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=425566.6666666667, ans=0.125 2023-09-29 16:40:43,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 16:40:43,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 16:40:45,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:46,809 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.884e+02 2.162e+02 2.621e+02 5.674e+02, threshold=4.324e+02, percent-clipped=3.0 2023-09-29 16:40:46,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:48,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:48,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:48,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 16:40:50,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. 
Duration: 0.91 2023-09-29 16:40:50,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:40:52,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:52,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:40:54,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 16:40:54,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 16:40:54,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:55,945 INFO [train.py:1039] (0/4) Epoch 13, batch 100, loss[loss=0.2088, simple_loss=0.2887, pruned_loss=0.06441, over 24060.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.269, pruned_loss=0.06098, over 1883609.82 frames. ], batch size: 80, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:40:56,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:58,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:40:58,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:41:01,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 16:41:01,712 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=425633.3333333333, ans=0.125 2023-09-29 16:41:04,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:41:06,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=425633.3333333333, ans=0.125 2023-09-29 16:41:07,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:08,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 16:41:08,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:41:12,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=425700.0, ans=0.0 2023-09-29 16:41:13,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:41:13,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:13,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:41:13,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:41:13,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 16:41:15,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 16:41:15,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:41:15,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:17,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:17,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:20,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 16:41:22,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:24,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:25,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:41:26,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=425700.0, ans=0.2 2023-09-29 16:41:29,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 16:41:31,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=425766.6666666667, ans=0.125 2023-09-29 16:41:33,364 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 16:41:33,389 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 16:41:34,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:41:34,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:41:39,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:41:42,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:42,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:48,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:49,755 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 16:41:50,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=425833.3333333333, ans=0.1 2023-09-29 16:41:51,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-09-29 16:41:53,915 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.76 vs. limit=15.0 2023-09-29 16:41:56,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:41:57,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:41:58,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=425833.3333333333, ans=0.2 2023-09-29 16:41:59,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=425900.0, ans=0.1 2023-09-29 16:42:01,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:04,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:06,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:08,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:42:10,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:42:10,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=425900.0, ans=0.125 2023-09-29 16:42:12,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:14,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:14,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:42:14,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:16,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 16:42:16,374 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 16:42:16,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:17,684 INFO [train.py:1039] (0/4) Epoch 13, batch 150, loss[loss=0.1823, simple_loss=0.2572, pruned_loss=0.05374, over 24590.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2685, pruned_loss=0.06168, over 2519013.01 frames. ], batch size: 60, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:42:17,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:42:18,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=425966.6666666667, ans=0.1 2023-09-29 16:42:19,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:42:19,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:19,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:19,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:42:19,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:42:19,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:42:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:20,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:22,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:24,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 16:42:24,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:42:26,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:27,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:29,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:29,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:42:31,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:32,950 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=426033.3333333333, ans=0.125 2023-09-29 16:42:34,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:34,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:36,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=426033.3333333333, ans=0.125 2023-09-29 16:42:37,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 16:42:38,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:42,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 16:42:42,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 16:42:42,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 16:42:46,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:42:46,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:42:48,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:42:48,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:49,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:49,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:51,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:52,868 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-29 16:42:54,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:59,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:02,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:43:04,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 16:43:06,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=426166.6666666667, ans=0.1 2023-09-29 16:43:09,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:43:09,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:09,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:11,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:43:12,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:43:14,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 16:43:15,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:15,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 16:43:23,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:24,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:24,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:43:24,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:43:27,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:27,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 16:43:28,668 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.32 vs. limit=15.0 2023-09-29 16:43:29,955 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=12.0 2023-09-29 16:43:30,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 16:43:32,026 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.958e+02 2.151e+02 2.617e+02 4.145e+02, threshold=4.302e+02, percent-clipped=0.0 2023-09-29 16:43:32,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:43:33,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:35,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:43:35,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 16:43:35,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:37,444 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 16:43:40,263 INFO [train.py:1039] (0/4) Epoch 13, batch 200, loss[loss=0.1918, simple_loss=0.2793, pruned_loss=0.05216, over 24286.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2679, pruned_loss=0.06175, over 3007825.19 frames. ], batch size: 74, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:43:42,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:43:45,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:43:45,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 16:43:49,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 16:43:51,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:51,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:53,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 16:43:55,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:43:56,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:58,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:59,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=426366.6666666667, ans=0.125 2023-09-29 16:44:02,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=426366.6666666667, ans=0.125 2023-09-29 16:44:03,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:44:03,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 16:44:05,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:09,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=426366.6666666667, ans=0.125 2023-09-29 16:44:20,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=426433.3333333333, ans=0.5 2023-09-29 16:44:23,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:44:23,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=426433.3333333333, ans=0.125 2023-09-29 16:44:24,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:44:24,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:44:26,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:44:26,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 16:44:26,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:44:28,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 16:44:28,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:44:28,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:28,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:44:32,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 16:44:32,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:44:32,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:39,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:44:40,912 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.54 vs. limit=15.0 2023-09-29 16:44:44,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:51,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:51,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 16:44:58,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:59,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 16:45:01,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:01,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:45:01,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:03,244 INFO [train.py:1039] (0/4) Epoch 13, batch 250, loss[loss=0.2205, simple_loss=0.282, pruned_loss=0.07947, over 23892.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.269, pruned_loss=0.06269, over 3377348.96 frames. ], batch size: 195, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:45:03,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:45:04,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 16:45:05,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:45:05,694 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 16:45:08,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 16:45:10,385 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-64000.pt 2023-09-29 16:45:13,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:45:13,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:15,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:17,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:45:17,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:18,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:45:23,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:45:29,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=426700.0, ans=0.125 2023-09-29 16:45:30,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=426700.0, ans=0.125 2023-09-29 16:45:35,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:36,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 16:45:36,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:45:42,092 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=426766.6666666667, ans=0.0 2023-09-29 16:45:45,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:45:46,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:45:48,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:45:48,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:48,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:45:48,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:45:50,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:53,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:45:56,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 16:45:56,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:45:57,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:45:57,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:45:57,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:45:59,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:01,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:46:01,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:46:04,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:06,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:46:06,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:07,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=426833.3333333333, ans=0.0 2023-09-29 16:46:09,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 16:46:12,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=426900.0, ans=0.125 2023-09-29 16:46:12,703 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=16.27 vs. limit=15.0 2023-09-29 16:46:13,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:15,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:46:15,944 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.87 vs. limit=15.0 2023-09-29 16:46:17,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=426900.0, ans=0.2 2023-09-29 16:46:20,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:21,733 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.073e+02 2.477e+02 4.320e+02, threshold=4.145e+02, percent-clipped=1.0 2023-09-29 16:46:22,139 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=426900.0, ans=0.125 2023-09-29 16:46:22,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.78 vs. limit=15.0 2023-09-29 16:46:23,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:46:23,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=426900.0, ans=0.2 2023-09-29 16:46:26,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 16:46:28,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:46:29,457 INFO [train.py:1039] (0/4) Epoch 13, batch 300, loss[loss=0.1873, simple_loss=0.2718, pruned_loss=0.05136, over 24685.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2665, pruned_loss=0.0617, over 3666295.42 frames. ], batch size: 73, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:46:29,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:31,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 16:46:32,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:46:32,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:46:32,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 16:46:33,532 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.15 vs. limit=6.0 2023-09-29 16:46:39,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 16:46:40,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:46:43,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:46:43,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 16:46:45,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:45,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:46:45,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 16:46:45,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:46:50,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:46:50,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=427033.3333333333, ans=0.0 2023-09-29 16:46:52,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=427033.3333333333, ans=0.125 2023-09-29 16:46:55,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:46:56,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. 
Duration: 0.88 2023-09-29 16:47:01,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 16:47:01,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:02,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:05,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:05,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 16:47:05,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:47:09,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:47:12,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:47:12,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:17,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:47:18,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 16:47:18,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 16:47:22,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:23,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 16:47:25,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:29,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:47:33,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:47:33,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 16:47:37,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:37,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:47:40,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:42,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:47:42,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 16:47:42,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 16:47:43,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:45,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 16:47:46,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:47,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:48,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:48,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:49,492 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.66 vs. limit=15.0 2023-09-29 16:47:50,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:51,855 INFO [train.py:1039] (0/4) Epoch 13, batch 350, loss[loss=0.199, simple_loss=0.2835, pruned_loss=0.05722, over 24291.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2654, pruned_loss=0.06147, over 3890980.97 frames. ], batch size: 74, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:47:55,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:47:55,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 16:47:58,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:03,485 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=427300.0, ans=0.1 2023-09-29 16:48:05,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:48:07,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:07,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:10,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 16:48:12,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:12,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 16:48:15,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:15,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 16:48:15,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:18,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. 
Duration: 0.98 2023-09-29 16:48:20,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:48:22,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:23,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:48:24,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=427433.3333333333, ans=0.04949747468305833 2023-09-29 16:48:25,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,052 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.31 vs. limit=22.5 2023-09-29 16:48:26,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:48:26,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:27,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:48:28,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 16:48:28,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:36,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:48:36,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:48:37,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:48:37,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:41,183 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:48:44,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 16:48:45,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:49,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:49,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:48:50,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:50,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. 
Duration: 0.51 2023-09-29 16:48:54,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:48:54,744 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 16:48:57,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 16:48:57,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:00,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:49:00,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 16:49:02,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:04,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:49:05,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:07,240 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.930e+02 2.101e+02 2.393e+02 3.670e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-29 16:49:07,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:07,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:49:10,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:11,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=427566.6666666667, ans=0.0 2023-09-29 16:49:13,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:49:15,502 INFO [train.py:1039] (0/4) Epoch 13, batch 400, loss[loss=0.1706, simple_loss=0.2456, pruned_loss=0.04778, over 23100.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2648, pruned_loss=0.06121, over 4059643.45 frames. ], batch size: 51, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:49:15,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:49:16,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=427633.3333333333, ans=0.2 2023-09-29 16:49:17,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 16:49:17,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:17,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:19,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:49:20,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 16:49:21,694 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.69 vs. limit=15.0 2023-09-29 16:49:24,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:26,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:28,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 16:49:29,310 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:49:30,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 16:49:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:31,049 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.27 vs. limit=22.5 2023-09-29 16:49:32,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 16:49:32,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:49:38,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 16:49:38,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 16:49:38,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:49:38,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:40,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:43,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 16:49:43,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 16:49:48,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:50,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:52,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 16:49:52,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. 
Duration: 0.91 2023-09-29 16:49:52,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:55,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:49:57,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:57,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:50:00,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:04,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=427833.3333333333, ans=0.0 2023-09-29 16:50:07,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 16:50:08,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:50:11,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 16:50:13,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:50:15,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:50:15,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. 
Duration: 0.83 2023-09-29 16:50:18,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:50:21,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:50:23,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:50:25,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:27,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 16:50:28,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:50:30,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 16:50:31,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:50:31,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:50:32,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=427900.0, ans=0.0 2023-09-29 16:50:33,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 16:50:36,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 16:50:37,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:50:38,459 INFO [train.py:1039] (0/4) Epoch 13, batch 450, loss[loss=0.2011, simple_loss=0.2707, pruned_loss=0.06573, over 23397.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2653, pruned_loss=0.0611, over 4208602.20 frames. ], batch size: 93, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:50:38,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:50:40,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 16:50:40,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:50:40,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:50:41,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:50:41,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=427966.6666666667, ans=0.1 2023-09-29 16:50:43,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 16:50:43,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:50:43,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 16:50:46,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:50:55,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=428033.3333333333, ans=0.2 2023-09-29 16:50:57,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=428033.3333333333, ans=0.04949747468305833 2023-09-29 16:50:57,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=428033.3333333333, ans=0.0 2023-09-29 16:50:58,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:58,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:00,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 16:51:01,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 16:51:03,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:51:08,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:09,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:13,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:51:13,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:18,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 16:51:18,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 16:51:21,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 16:51:21,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:51:22,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:22,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:51:25,114 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 16:51:25,137 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 16:51:26,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:26,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:51:28,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 16:51:31,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:51:31,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:51:31,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 16:51:33,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 16:51:36,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:39,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:51:40,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:51:41,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 16:51:45,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:51:46,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 16:51:48,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 16:51:49,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 16:51:50,599 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.75 vs. limit=15.0 2023-09-29 16:51:52,576 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.942e+02 2.289e+02 2.754e+02 3.873e+02, threshold=4.578e+02, percent-clipped=0.0 2023-09-29 16:51:53,848 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.52 vs. limit=5.0 2023-09-29 16:51:55,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:51:58,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:00,615 INFO [train.py:1039] (0/4) Epoch 13, batch 500, loss[loss=0.1776, simple_loss=0.2454, pruned_loss=0.0549, over 14541.00 frames. ], tot_loss[loss=0.1953, simple_loss=0.2665, pruned_loss=0.06206, over 4313903.31 frames. ], batch size: 31, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:52:00,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:52:00,772 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 16:52:03,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:05,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:52:06,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:52:06,107 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 16:52:07,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 16:52:07,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:10,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:52:16,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:52:17,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:52:19,855 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.41 vs. limit=15.0 2023-09-29 16:52:20,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:20,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:20,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:33,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:33,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-09-29 16:52:34,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:52:34,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:34,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 16:52:34,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:52:37,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=428433.3333333333, ans=0.2 2023-09-29 16:52:38,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:52:39,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:52:39,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:52:39,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:40,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 16:52:43,552 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 16:52:49,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:52:50,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:53,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:52:53,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=428500.0, ans=0.2 2023-09-29 16:52:54,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 16:52:58,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:53:00,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:03,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:06,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:53:12,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:14,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-09-29 16:53:14,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:14,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:18,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 16:53:19,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:53:20,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:23,478 INFO [train.py:1039] (0/4) Epoch 13, batch 550, loss[loss=0.1876, simple_loss=0.2686, pruned_loss=0.05328, over 24531.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2678, pruned_loss=0.06217, over 4416926.70 frames. ], batch size: 66, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:53:26,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 16:53:28,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 16:53:28,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=428633.3333333333, ans=0.1 2023-09-29 16:53:30,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:30,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-09-29 16:53:30,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=428633.3333333333, ans=0.125 2023-09-29 16:53:31,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:53:31,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:31,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:53:33,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:53:36,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:37,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 16:53:39,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 16:53:41,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=428700.0, ans=0.125 2023-09-29 16:53:43,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:53:43,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:47,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:53:47,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:52,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 16:53:53,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=428700.0, ans=0.125 2023-09-29 16:53:54,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 16:53:54,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:54:03,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:54:03,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:03,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 16:54:07,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:07,951 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 16:54:08,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:54:09,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 16:54:12,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:12,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:54:12,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:54:13,348 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=7.55 vs. limit=12.0 2023-09-29 16:54:14,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:15,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 16:54:15,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 16:54:17,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:54:19,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:54:19,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:54:19,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:54:20,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:54:22,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:54:25,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:54:26,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:26,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:54:27,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:54:29,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:31,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:54:32,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:54:34,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:54:34,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:54:39,421 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.968e+02 2.209e+02 2.597e+02 3.344e+02, threshold=4.418e+02, percent-clipped=0.0 2023-09-29 16:54:39,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 16:54:40,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=428900.0, ans=0.125 2023-09-29 16:54:44,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 16:54:44,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:54:45,769 INFO [train.py:1039] (0/4) Epoch 13, batch 600, loss[loss=0.1926, simple_loss=0.2473, pruned_loss=0.0689, over 22798.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2678, pruned_loss=0.0619, over 4474666.45 frames. ], batch size: 322, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:54:45,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:54:45,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 16:54:48,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=428966.6666666667, ans=0.0 2023-09-29 16:54:48,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=428966.6666666667, ans=0.0 2023-09-29 16:54:52,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:54:52,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:54:54,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 16:54:56,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:54:56,604 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:54:59,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:02,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:05,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 16:55:06,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:55:14,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-09-29 16:55:14,615 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:55:16,222 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=429033.3333333333, ans=0.125 2023-09-29 16:55:18,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:55:18,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:18,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:55:25,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:55:25,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:55:27,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:33,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:55:39,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:39,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:39,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 16:55:39,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=429166.6666666667, ans=0.1 2023-09-29 16:55:48,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 16:55:49,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=429166.6666666667, ans=0.2 2023-09-29 16:55:54,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:55:54,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:55:58,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 16:55:58,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:56:01,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 16:56:03,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:56:03,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:56:05,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=429233.3333333333, ans=0.0 2023-09-29 16:56:08,411 INFO [train.py:1039] (0/4) Epoch 13, batch 650, loss[loss=0.1713, simple_loss=0.2479, pruned_loss=0.04739, over 24337.00 frames. 
], tot_loss[loss=0.1945, simple_loss=0.2659, pruned_loss=0.06149, over 4515174.10 frames. ], batch size: 56, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:56:08,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:56:08,966 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=429300.0, ans=0.0 2023-09-29 16:56:11,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:56:13,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:56:13,941 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=429300.0, ans=0.0 2023-09-29 16:56:15,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:56:17,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:18,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=429300.0, ans=0.125 2023-09-29 16:56:20,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 16:56:20,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:56:28,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 16:56:28,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:30,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:34,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 16:56:34,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:56:36,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:37,019 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=429366.6666666667, ans=0.1 2023-09-29 16:56:40,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:56:40,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 16:56:45,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:45,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:45,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:56:46,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 16:56:48,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:56:49,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:56:50,025 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 16:56:50,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:50,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:56:55,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:56,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:56:56,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:56:58,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:56:58,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 16:56:59,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:56:59,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 16:57:01,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:57:01,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:57:02,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:57:03,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 16:57:04,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 16:57:04,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:04,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:57:04,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:57:06,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:57:07,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:57:14,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:14,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:57:18,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:57:19,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:21,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 16:57:21,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:24,841 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.054e+02 2.276e+02 2.735e+02 4.255e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 16:57:28,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:57:28,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:28,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:29,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:31,311 INFO [train.py:1039] (0/4) Epoch 13, batch 700, loss[loss=0.1867, simple_loss=0.2751, pruned_loss=0.04915, over 24680.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.265, pruned_loss=0.06127, over 4543877.31 frames. ], batch size: 73, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:57:34,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. 
Duration: 0.93 2023-09-29 16:57:35,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 16:57:39,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 16:57:40,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:42,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:57:43,205 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.57 vs. limit=15.0 2023-09-29 16:57:45,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 16:57:49,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:50,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:57:52,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:54,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=429700.0, ans=0.0 2023-09-29 16:57:56,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:57:56,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 16:58:00,408 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.24 vs. limit=12.0 2023-09-29 16:58:01,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:58:04,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:58:04,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:58:05,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 16:58:06,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=429766.6666666667, ans=0.125 2023-09-29 16:58:07,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 16:58:12,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:58:12,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:58:13,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:58:18,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:58:18,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-09-29 16:58:24,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:24,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:58:24,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 16:58:28,724 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.71 vs. limit=6.0 2023-09-29 16:58:29,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:58:29,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:32,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:58:38,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:58:40,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 16:58:43,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 16:58:43,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 16:58:45,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 16:58:48,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=429900.0, ans=0.125 2023-09-29 16:58:50,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:58:50,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:58:50,852 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.09 vs. limit=15.0 2023-09-29 16:58:53,028 INFO [train.py:1039] (0/4) Epoch 13, batch 750, loss[loss=0.1904, simple_loss=0.2559, pruned_loss=0.06243, over 23893.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2643, pruned_loss=0.06085, over 4573713.39 frames. ], batch size: 195, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 16:58:53,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:53,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 16:58:57,112 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-29 16:58:57,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 16:58:58,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 16:58:58,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. 
Duration: 0.85 2023-09-29 16:58:58,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 16:58:58,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 16:58:59,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:59:00,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 16:59:02,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:04,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:07,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:08,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:08,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:59:08,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 16:59:08,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=430033.3333333333, ans=0.0 2023-09-29 16:59:09,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=430033.3333333333, ans=0.125 2023-09-29 16:59:11,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:59:13,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:59:15,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:59:16,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:16,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:16,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 16:59:18,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:59:19,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:21,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:24,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 16:59:26,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 16:59:26,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:59:27,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 16:59:27,966 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 16:59:28,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=430100.0, ans=0.0 2023-09-29 16:59:29,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 16:59:29,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:59:29,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:59:33,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:59:43,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:43,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:43,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 16:59:44,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:45,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:46,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 16:59:46,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:59:48,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:59:49,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:59:51,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:59:52,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 16:59:54,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:58,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:59,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:59:59,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:00:03,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:00:08,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 17:00:08,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:09,479 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.868e+02 2.099e+02 2.383e+02 3.939e+02, threshold=4.199e+02, percent-clipped=0.0 2023-09-29 17:00:09,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:11,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:13,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:15,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:15,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:00:16,489 INFO [train.py:1039] (0/4) Epoch 13, batch 800, loss[loss=0.1849, simple_loss=0.2726, pruned_loss=0.04859, over 24584.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2651, pruned_loss=0.06098, over 4616619.00 frames. ], batch size: 71, lr: 8.09e-03, grad_scale: 32.0 2023-09-29 17:00:24,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:00:24,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:25,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:25,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:25,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:27,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:30,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:35,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:36,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:00:37,616 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.77 vs. limit=22.5 2023-09-29 17:00:38,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=430366.6666666667, ans=0.1 2023-09-29 17:00:39,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. 
Duration: 0.76 2023-09-29 17:00:40,807 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.19 vs. limit=10.0 2023-09-29 17:00:41,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:41,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:43,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:00:43,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:43,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 17:00:43,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:45,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 17:00:47,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:50,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:51,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 17:00:52,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:52,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=430433.3333333333, ans=0.125 2023-09-29 17:00:53,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:54,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:59,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:00:59,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:00:59,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 17:01:01,526 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 17:01:01,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 17:01:01,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:01:01,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:03,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:01:03,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:03,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=430500.0, ans=0.2 2023-09-29 17:01:08,512 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 17:01:09,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 17:01:10,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:01:13,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:01:13,536 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:01:17,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:01:21,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:22,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 17:01:23,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:01:25,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-09-29 17:01:30,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:33,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:01:33,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 17:01:33,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:01:34,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:36,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 17:01:36,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:38,264 INFO [train.py:1039] (0/4) Epoch 13, batch 850, loss[loss=0.1739, simple_loss=0.243, pruned_loss=0.05244, over 24300.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2662, pruned_loss=0.06158, over 4640501.30 frames. ], batch size: 56, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 17:01:38,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:01:39,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 17:01:40,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=430633.3333333333, ans=0.2 2023-09-29 17:01:41,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:01:42,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 17:01:45,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 17:01:45,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 17:01:47,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:47,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:01:51,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:52,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:52,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 17:01:54,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=430700.0, ans=0.0 2023-09-29 17:01:56,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:56,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:57,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 17:02:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 17:02:03,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:02:05,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 17:02:08,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 17:02:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 17:02:13,098 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.07 vs. limit=15.0 2023-09-29 17:02:13,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 17:02:13,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 17:02:13,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:02:13,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:02:17,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:18,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:20,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 17:02:21,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:21,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:24,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:02:24,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:02:25,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=430766.6666666667, ans=0.0 2023-09-29 17:02:27,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:02:27,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 17:02:27,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 17:02:31,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:02:31,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:32,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:02:32,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:34,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:35,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=430833.3333333333, ans=0.125 2023-09-29 17:02:37,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:38,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:02:40,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:02:41,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:02:41,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 17:02:45,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=430900.0, ans=0.0 2023-09-29 17:02:51,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:02:51,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:53,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 17:02:53,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:02:53,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:57,225 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.022e+02 2.303e+02 2.741e+02 5.777e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 17:02:57,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 17:02:59,878 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=430900.0, ans=0.0 2023-09-29 17:03:02,576 INFO [train.py:1039] (0/4) Epoch 13, batch 900, loss[loss=0.203, simple_loss=0.2852, pruned_loss=0.06033, over 24647.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2676, pruned_loss=0.06245, over 4654278.10 frames. ], batch size: 68, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:03:05,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 17:03:05,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=430966.6666666667, ans=0.125 2023-09-29 17:03:07,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:07,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 17:03:10,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:03:10,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 17:03:11,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:03:13,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:03:13,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:03:14,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:03:26,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:03:26,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:03:26,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:03:29,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:33,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431100.0, ans=0.1 2023-09-29 17:03:35,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 17:03:37,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:03:42,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:03:42,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:03:43,982 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 17:03:44,994 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.17 vs. limit=15.0 2023-09-29 17:03:45,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 17:03:53,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:03:53,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 17:03:53,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:03:59,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:01,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:03,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 17:04:03,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:04:08,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 17:04:10,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:04:11,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:12,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:04:13,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:18,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 17:04:18,233 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. 
Duration: 0.91 2023-09-29 17:04:19,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:04:19,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 17:04:20,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=431233.3333333333, ans=0.125 2023-09-29 17:04:21,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:22,901 INFO [train.py:1039] (0/4) Epoch 13, batch 950, loss[loss=0.1981, simple_loss=0.2764, pruned_loss=0.05987, over 24560.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2672, pruned_loss=0.06181, over 4684539.33 frames. ], batch size: 71, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:04:24,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 17:04:29,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:31,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 17:04:36,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=431300.0, ans=0.2 2023-09-29 17:04:38,182 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 17:04:40,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:41,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:04:43,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:43,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:04:43,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 17:04:45,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:04:47,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:47,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 17:04:48,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:49,494 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.15 vs. 
limit=10.0 2023-09-29 17:04:53,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:53,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:53,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:54,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 17:04:56,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:04:58,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:05:01,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:05:07,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:05:11,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 17:05:14,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:05:14,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 17:05:15,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:16,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:16,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:05:18,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=431500.0, ans=0.125 2023-09-29 17:05:19,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 17:05:19,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:05:24,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:24,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:26,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 17:05:26,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:26,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:05:26,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-09-29 17:05:31,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:05:34,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:39,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431566.6666666667, ans=0.1 2023-09-29 17:05:40,316 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.121e+02 2.375e+02 2.805e+02 4.363e+02, threshold=4.749e+02, percent-clipped=0.0 2023-09-29 17:05:40,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:05:41,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 17:05:41,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 17:05:44,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=431633.3333333333, ans=0.1 2023-09-29 17:05:46,078 INFO [train.py:1039] (0/4) Epoch 13, batch 1000, loss[loss=0.207, simple_loss=0.2616, pruned_loss=0.07623, over 23777.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2664, pruned_loss=0.06123, over 4696960.79 frames. ], batch size: 212, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:05:46,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:48,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. 
Duration: 0.76 2023-09-29 17:05:48,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:54,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:05:56,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 17:05:56,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 17:06:01,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:01,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:06:03,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:06,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 17:06:10,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 17:06:11,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 17:06:12,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:14,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-09-29 17:06:15,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 17:06:17,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 17:06:17,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:19,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:27,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:28,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:06:28,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:30,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:30,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 17:06:31,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:32,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:06:33,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:06:33,496 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 17:06:38,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 17:06:39,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 17:06:39,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=431833.3333333333, ans=0.125 2023-09-29 17:06:41,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 17:06:44,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:06:46,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=431833.3333333333, ans=0.125 2023-09-29 17:06:51,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:51,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:06:51,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:54,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:06:56,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. 
Duration: 0.67 2023-09-29 17:06:57,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:06:58,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 17:07:00,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 17:07:00,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:00,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:07:01,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:07:04,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:07:07,642 INFO [train.py:1039] (0/4) Epoch 13, batch 1050, loss[loss=0.1914, simple_loss=0.2477, pruned_loss=0.06751, over 23472.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2656, pruned_loss=0.06112, over 4704014.99 frames. ], batch size: 285, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:07:07,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:10,468 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=431966.6666666667, ans=0.07 2023-09-29 17:07:11,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 17:07:11,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:07:15,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:07:16,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:18,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:19,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:07:21,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:07:24,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:07:24,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:07:24,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=432033.3333333333, ans=0.125 2023-09-29 17:07:26,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:07:26,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 17:07:27,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 17:07:28,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:28,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 17:07:33,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:33,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 17:07:33,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:07:36,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=432033.3333333333, ans=0.125 2023-09-29 17:07:39,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:40,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:07:41,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:42,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 17:07:44,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. 
Duration: 0.6 2023-09-29 17:07:44,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:49,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 17:07:50,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 17:07:52,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:55,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:07:57,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:07:57,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:07:57,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:08:02,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:08:05,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 17:08:09,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 17:08:09,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-29 17:08:09,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:09,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:08:10,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 17:08:15,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:08:18,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:18,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:18,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:18,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:08:19,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=432233.3333333333, ans=0.125 2023-09-29 17:08:20,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=432233.3333333333, ans=0.125 2023-09-29 17:08:24,666 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.212e+02 2.486e+02 3.871e+02, threshold=4.425e+02, percent-clipped=0.0 2023-09-29 17:08:24,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:24,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 17:08:26,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:26,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 17:08:26,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 17:08:27,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:08:29,428 INFO [train.py:1039] (0/4) Epoch 13, batch 1100, loss[loss=0.2119, simple_loss=0.2964, pruned_loss=0.06366, over 24562.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2652, pruned_loss=0.06081, over 4709898.49 frames. ], batch size: 71, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:08:29,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:08:36,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:08:41,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:08:43,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:08:43,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:43,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 17:08:45,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:08:48,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:08:49,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:08:53,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:08:53,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 17:08:54,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:08:54,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:08:54,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:58,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:08:59,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:09:05,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:09:08,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 17:09:10,210 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 17:09:10,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:13,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:15,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:09:15,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:09:17,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 17:09:17,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 17:09:18,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:09:18,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:09:18,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:18,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 17:09:25,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:09:26,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 17:09:28,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:09:31,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:09:35,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=432566.6666666667, ans=0.0 2023-09-29 17:09:36,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 17:09:36,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:09:38,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:09:39,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:09:41,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:41,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 17:09:42,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:09:42,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:43,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 17:09:43,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:09:45,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 17:09:48,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:09:48,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:09:50,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:09:53,079 INFO [train.py:1039] (0/4) Epoch 13, batch 1150, loss[loss=0.1962, simple_loss=0.2731, pruned_loss=0.05965, over 23741.00 frames. 
], tot_loss[loss=0.1952, simple_loss=0.2666, pruned_loss=0.06191, over 4694787.21 frames. ], batch size: 85, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:09:54,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:09:57,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:10:01,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:01,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:10:01,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 17:10:02,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:04,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 17:10:07,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:07,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:10:12,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 17:10:15,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:10:16,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=432700.0, ans=0.125 2023-09-29 17:10:17,914 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.57 vs. limit=15.0 2023-09-29 17:10:20,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:21,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:21,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 17:10:21,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:10:23,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:27,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 17:10:28,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:29,577 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.11 vs. limit=12.0 2023-09-29 17:10:30,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:10:38,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:42,962 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.68 vs. limit=15.0 2023-09-29 17:10:43,448 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.92 vs. limit=15.0 2023-09-29 17:10:45,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:46,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 17:10:46,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:46,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:53,241 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 17:10:54,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:03,413 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 17:11:06,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:07,129 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.65 vs. 
limit=12.0 2023-09-29 17:11:08,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:11:09,441 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.852e+02 2.092e+02 2.448e+02 3.672e+02, threshold=4.183e+02, percent-clipped=0.0 2023-09-29 17:11:09,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:11:09,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:11:13,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:15,565 INFO [train.py:1039] (0/4) Epoch 13, batch 1200, loss[loss=0.2013, simple_loss=0.2765, pruned_loss=0.06304, over 24060.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2669, pruned_loss=0.06171, over 4709412.80 frames. ], batch size: 80, lr: 8.07e-03, grad_scale: 32.0 2023-09-29 17:11:17,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:11:17,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:11:18,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:18,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:20,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 17:11:21,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:11:22,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=432966.6666666667, ans=0.07 2023-09-29 17:11:23,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:11:23,666 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:11:24,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:24,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:29,242 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 17:11:31,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 17:11:36,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:11:39,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:11:41,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:11:41,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=433033.3333333333, ans=0.1 2023-09-29 17:11:43,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:11:43,139 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 17:11:44,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:51,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:11:51,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:11:53,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 17:11:54,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:11:58,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=433100.0, ans=0.0 2023-09-29 17:11:59,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 17:12:03,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 17:12:04,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:12:05,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:12:05,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=433166.6666666667, ans=0.1 2023-09-29 17:12:07,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:09,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:12:11,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:12:11,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:12:13,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:12:13,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 17:12:13,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:12:15,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:15,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:12:18,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:12:18,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:22,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:12:24,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:12:27,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 17:12:32,418 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 17:12:34,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:12:36,884 INFO [train.py:1039] (0/4) Epoch 13, batch 1250, loss[loss=0.1766, simple_loss=0.2559, pruned_loss=0.04858, over 24416.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2684, pruned_loss=0.06244, over 4713930.01 frames. ], batch size: 63, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:12:37,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:37,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=433300.0, ans=0.1 2023-09-29 17:12:38,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:12:40,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:12:41,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 17:12:47,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:12:47,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:12:49,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 17:12:50,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:12:52,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:12:59,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:12:59,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:01,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:13:01,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:02,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:13:04,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 17:13:04,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:13:04,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:06,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:06,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:09,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:10,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:13:11,303 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:13:12,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=433433.3333333333, ans=0.5 2023-09-29 17:13:16,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 17:13:17,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:13:20,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.whiten.whitening_limit, batch_count=433433.3333333333, ans=12.0 2023-09-29 17:13:21,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:13:21,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 17:13:22,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:22,801 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 17:13:23,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=433433.3333333333, ans=0.125 2023-09-29 17:13:24,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:24,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:29,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:13:37,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 17:13:37,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 17:13:37,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. 
Duration: 0.78 2023-09-29 17:13:39,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:13:39,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 17:13:39,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:42,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:13:44,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:13:45,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 17:13:45,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:13:47,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:13:47,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:13:48,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:50,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 17:13:50,936 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.87 vs. 
limit=15.0 2023-09-29 17:13:52,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:54,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:13:55,654 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.895e+02 2.072e+02 2.279e+02 3.563e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-29 17:13:55,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:13:57,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:14:00,455 INFO [train.py:1039] (0/4) Epoch 13, batch 1300, loss[loss=0.1843, simple_loss=0.2699, pruned_loss=0.0494, over 24567.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2687, pruned_loss=0.06224, over 4715742.99 frames. ], batch size: 71, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:14:02,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:14:02,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 17:14:07,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:10,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:14:11,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:14:13,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:14:13,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:14:15,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 17:14:19,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:14:21,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:14:23,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 17:14:27,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:14:31,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:33,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:33,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:36,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:36,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 17:14:38,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:14:38,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 17:14:46,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:14:46,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:14:46,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 17:14:48,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:14:48,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:14:51,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:51,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 17:14:52,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:53,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 17:14:55,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:14:59,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:59,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:15:02,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 17:15:04,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 17:15:06,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 17:15:10,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:15:14,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 17:15:15,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:22,685 INFO [train.py:1039] (0/4) Epoch 13, batch 1350, loss[loss=0.2046, simple_loss=0.2516, pruned_loss=0.07875, over 19543.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2672, pruned_loss=0.06191, over 4715379.08 frames. ], batch size: 388, lr: 8.06e-03, grad_scale: 16.0 2023-09-29 17:15:22,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 17:15:25,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:15:28,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:32,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=433966.6666666667, ans=0.2 2023-09-29 17:15:33,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:33,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:36,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:15:36,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:40,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:42,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 17:15:43,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:15:43,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:15:46,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 17:15:47,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:15:49,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:15:49,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 17:15:50,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 17:15:53,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 17:15:54,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:55,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 17:16:07,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:17,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.62 vs. limit=15.0 2023-09-29 17:16:18,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:19,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 17:16:22,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:16:23,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 17:16:23,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:16:23,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:16:28,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:16:30,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 17:16:31,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:16:37,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 17:16:40,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 17:16:41,599 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.935e+02 2.126e+02 2.533e+02 4.347e+02, threshold=4.251e+02, percent-clipped=1.0 2023-09-29 17:16:43,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 17:16:44,928 INFO [train.py:1039] (0/4) Epoch 13, batch 1400, loss[loss=0.1888, simple_loss=0.2555, pruned_loss=0.06112, over 24439.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2659, pruned_loss=0.06138, over 4720533.84 frames. 
], batch size: 58, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:16:47,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:47,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=434300.0, ans=0.1 2023-09-29 17:16:50,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:16:52,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:16:56,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 17:16:58,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 17:17:10,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:17:11,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:14,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:17:14,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:17:18,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:17:20,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 17:17:22,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=434433.3333333333, ans=0.125 2023-09-29 17:17:30,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:30,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:34,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 17:17:34,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:17:36,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:17:36,978 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.99 vs. limit=10.0 2023-09-29 17:17:37,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:17:39,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:40,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:17:40,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:17:40,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=434500.0, ans=0.125 2023-09-29 17:17:41,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:17:43,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 17:17:43,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:17:47,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:51,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:18:01,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 17:18:03,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:18:03,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:18:06,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 17:18:07,625 INFO [train.py:1039] (0/4) Epoch 13, batch 1450, loss[loss=0.1939, simple_loss=0.2781, pruned_loss=0.05485, over 24459.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2659, pruned_loss=0.06112, over 4723944.24 frames. 
], batch size: 69, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:18:07,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:12,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:18:14,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:18:14,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=434633.3333333333, ans=0.125 2023-09-29 17:18:17,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:18:17,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:17,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:18:22,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:22,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:18:25,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:18:25,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-09-29 17:18:27,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:18:27,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 17:18:29,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:29,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:29,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 17:18:31,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:18:33,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:18:33,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 17:18:33,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:34,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:18:35,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=434700.0, ans=6.0 2023-09-29 17:18:36,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:18:38,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:38,453 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:18:42,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:18:44,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:18:45,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:45,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:48,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:48,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:18:48,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:50,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:18:53,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 17:18:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:18:57,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=434833.3333333333, ans=0.1 2023-09-29 17:19:00,119 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 17:19:02,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:03,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:19:03,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:05,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 17:19:06,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=434833.3333333333, ans=0.0 2023-09-29 17:19:06,325 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=434833.3333333333, ans=0.0 2023-09-29 17:19:09,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:11,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 17:19:14,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 17:19:15,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:19:17,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:18,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:20,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 17:19:22,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 17:19:23,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 17:19:25,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:25,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:19:27,168 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.823e+02 1.982e+02 2.343e+02 3.097e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-29 17:19:30,427 INFO [train.py:1039] (0/4) Epoch 13, batch 1500, loss[loss=0.1821, simple_loss=0.2639, pruned_loss=0.05017, over 24286.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2652, pruned_loss=0.06085, over 4724887.58 frames. ], batch size: 61, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:19:38,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 17:19:39,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 17:19:39,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:19:40,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:42,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:42,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:19:44,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 17:19:45,140 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.44 vs. limit=15.0 2023-09-29 17:19:45,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:19:45,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:19:45,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:47,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:47,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:19:49,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 17:19:55,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:19:55,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:19:57,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:58,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=435033.3333333333, ans=0.05 2023-09-29 17:20:01,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 17:20:02,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.05 vs. limit=15.0 2023-09-29 17:20:06,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 17:20:08,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:20:08,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 17:20:08,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=435100.0, ans=0.07 2023-09-29 17:20:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:20:13,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:14,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:16,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:20:18,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 17:20:18,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:20:18,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:20,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 17:20:20,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:26,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 17:20:26,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 17:20:32,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:20:34,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:20:36,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.16 vs. limit=5.0 2023-09-29 17:20:37,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=435233.3333333333, ans=0.1 2023-09-29 17:20:39,156 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 17:20:39,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:40,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 17:20:40,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:20:42,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:20:42,781 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 17:20:44,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 17:20:47,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 17:20:48,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:52,186 INFO [train.py:1039] (0/4) Epoch 13, batch 1550, loss[loss=0.2097, simple_loss=0.2679, pruned_loss=0.07575, over 23700.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2659, pruned_loss=0.06117, over 4720699.58 frames. ], batch size: 164, lr: 8.04e-03, grad_scale: 16.0 2023-09-29 17:20:54,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:54,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:54,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:55,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:56,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:57,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 17:20:57,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 17:20:59,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 17:21:00,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 17:21:00,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 17:21:02,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:04,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:05,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:05,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:21:07,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:07,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:10,258 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 17:21:10,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:10,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:21:10,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 17:21:15,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:21:15,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 17:21:16,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:16,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 17:21:19,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 17:21:19,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 17:21:19,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:19,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=435366.6666666667, ans=0.125 2023-09-29 17:21:20,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:25,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:21:27,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 17:21:27,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-09-29 17:21:29,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=435433.3333333333, ans=0.1 2023-09-29 17:21:29,888 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.26 vs. limit=15.0 2023-09-29 17:21:32,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=435433.3333333333, ans=0.125 2023-09-29 17:21:35,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:40,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:40,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:21:40,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:21:40,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 17:21:46,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:21:49,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:51,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:21:53,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:21:53,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:55,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 17:21:55,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:21:57,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:21:58,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:58,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:21:58,688 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 17:22:01,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:04,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=435566.6666666667, ans=0.1 2023-09-29 17:22:08,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. 
Duration: 0.94 2023-09-29 17:22:11,513 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 2.000e+02 2.251e+02 2.787e+02 4.721e+02, threshold=4.502e+02, percent-clipped=2.0 2023-09-29 17:22:11,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:13,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:22:14,589 INFO [train.py:1039] (0/4) Epoch 13, batch 1600, loss[loss=0.2519, simple_loss=0.3044, pruned_loss=0.0997, over 19468.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2675, pruned_loss=0.06288, over 4699346.38 frames. ], batch size: 388, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:22:14,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 17:22:16,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:22:17,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:17,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:22:17,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:22:18,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:22:22,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 17:22:22,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 17:22:23,350 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:22:24,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 17:22:26,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 17:22:28,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:22:30,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 17:22:31,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:22:34,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:22:36,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=435700.0, ans=0.125 2023-09-29 17:22:39,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:22:41,859 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=435700.0, ans=0.2 2023-09-29 17:22:44,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-09-29 17:22:46,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:22:46,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 17:22:46,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:47,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 17:22:51,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 17:22:58,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:22:58,845 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.26 vs. limit=10.0 2023-09-29 17:23:00,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 17:23:00,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:23:02,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:02,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:23:06,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. 
Duration: 0.511125 2023-09-29 17:23:08,989 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.63 vs. limit=10.0 2023-09-29 17:23:09,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 17:23:10,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=435833.3333333333, ans=0.0 2023-09-29 17:23:11,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:23:13,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:23:16,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:23:17,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:23:19,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:23:25,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 17:23:27,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:23:30,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 17:23:30,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:23:30,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 17:23:35,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:37,196 INFO [train.py:1039] (0/4) Epoch 13, batch 1650, loss[loss=0.2056, simple_loss=0.28, pruned_loss=0.06561, over 23344.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2676, pruned_loss=0.06283, over 4698911.18 frames. ], batch size: 93, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:23:37,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:23:37,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:23:37,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 17:23:38,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 17:23:38,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. 
Duration: 0.97 2023-09-29 17:23:38,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 17:23:40,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=435966.6666666667, ans=0.015 2023-09-29 17:23:42,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=435966.6666666667, ans=0.0 2023-09-29 17:23:43,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:43,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:45,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:23:45,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:23:48,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:50,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 17:23:53,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:54,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:54,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 17:23:54,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:23:54,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 17:23:54,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 17:23:58,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=436033.3333333333, ans=0.0 2023-09-29 17:23:58,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=436033.3333333333, ans=0.125 2023-09-29 17:23:59,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:24:01,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:24:11,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 17:24:12,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=436100.0, ans=0.125 2023-09-29 17:24:13,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:14,278 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.57 vs. limit=10.0 2023-09-29 17:24:14,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-09-29 17:24:18,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:20,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:24:20,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:24:21,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:23,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:24:23,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:26,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:24:27,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:29,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:29,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:30,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:30,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 17:24:36,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:36,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 17:24:39,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:39,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 17:24:41,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 17:24:41,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 17:24:43,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:43,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:24:43,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:43,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:43,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 17:24:50,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:24:51,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:24:51,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:55,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 17:24:56,551 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.021e+02 2.239e+02 2.775e+02 4.189e+02, threshold=4.478e+02, percent-clipped=0.0 2023-09-29 17:24:57,094 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=436233.3333333333, ans=0.2 2023-09-29 17:24:59,924 INFO [train.py:1039] (0/4) Epoch 13, batch 1700, loss[loss=0.1887, simple_loss=0.2443, pruned_loss=0.06655, over 22699.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2665, pruned_loss=0.06226, over 4707132.98 frames. ], batch size: 322, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:25:00,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:25:00,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:25:00,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 17:25:00,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:00,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 17:25:00,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:03,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:25:04,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:25:04,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 17:25:07,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:25:16,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:18,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:25:20,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=436366.6666666667, ans=0.2 2023-09-29 17:25:24,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:25:24,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:26,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:26,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:25:29,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 17:25:33,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:25:33,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:33,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:25:34,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:25:36,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 17:25:37,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 17:25:39,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:40,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 17:25:44,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 17:25:46,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=436433.3333333333, ans=0.125 2023-09-29 17:25:51,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=436500.0, ans=0.0 2023-09-29 17:25:53,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:25:53,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:25:54,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:56,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:25:56,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 17:25:57,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:59,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:59,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 17:26:01,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:01,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:26:01,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:01,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:04,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:04,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:26:04,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:04,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:26:05,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:09,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:11,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 17:26:14,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:16,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:17,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. 
Duration: 0.57 2023-09-29 17:26:22,897 INFO [train.py:1039] (0/4) Epoch 13, batch 1750, loss[loss=0.1989, simple_loss=0.2629, pruned_loss=0.06746, over 23625.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2651, pruned_loss=0.06248, over 4687918.38 frames. ], batch size: 149, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:26:24,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:26,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:28,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:26:28,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 17:26:28,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:32,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:26:32,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:39,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 17:26:40,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:43,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. 
Duration: 0.64 2023-09-29 17:26:44,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:45,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:26:48,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:26:50,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 17:26:52,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:52,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 17:26:57,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=436766.6666666667, ans=0.2 2023-09-29 17:27:00,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:27:04,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:04,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:06,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=436766.6666666667, ans=0.125 2023-09-29 17:27:07,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:27:07,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:11,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:27:12,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:14,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:14,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:27:15,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 17:27:17,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:20,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 17:27:22,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:22,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:22,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 17:27:23,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=436833.3333333333, ans=0.0 2023-09-29 17:27:28,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:27:28,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:27:29,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:32,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:37,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:40,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:27:42,027 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.899e+02 2.041e+02 2.421e+02 3.023e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-29 17:27:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:27:45,433 INFO [train.py:1039] (0/4) Epoch 13, batch 1800, loss[loss=0.1613, simple_loss=0.2358, pruned_loss=0.04345, over 24607.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2643, pruned_loss=0.06188, over 4689297.23 frames. ], batch size: 60, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:27:45,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. 
Duration: 0.87 2023-09-29 17:27:45,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:48,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:27:48,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:27:48,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:27:48,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:27:48,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:27:51,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:27:51,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:55,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:27:56,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:59,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:28:01,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:28:04,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:08,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:28:12,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:28:12,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 17:28:13,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:18,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:22,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 17:28:22,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=437100.0, ans=0.0 2023-09-29 17:28:22,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=437100.0, ans=0.2 2023-09-29 17:28:23,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-09-29 17:28:23,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 17:28:25,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:26,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:26,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:28:28,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:28:32,197 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.76 vs. limit=15.0 2023-09-29 17:28:32,926 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 17:28:34,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:28:38,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:38,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 17:28:39,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 17:28:41,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 17:28:43,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:28:45,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:28:48,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 17:28:48,611 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=437166.6666666667, ans=0.2 2023-09-29 17:28:54,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:28:56,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 17:28:56,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:28:56,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:56,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:28:58,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 17:29:01,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:29:01,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:29:03,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 17:29:03,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:04,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:04,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:29:04,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,722 INFO [train.py:1039] (0/4) Epoch 13, batch 1850, loss[loss=0.194, simple_loss=0.2638, pruned_loss=0.06214, over 23201.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2651, pruned_loss=0.06194, over 4692410.30 frames. ], batch size: 119, lr: 8.03e-03, grad_scale: 16.0 2023-09-29 17:29:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:29:09,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:29:10,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:15,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 17:29:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:29:20,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=437300.0, ans=0.125 2023-09-29 17:29:22,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:29:23,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 17:29:23,630 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.84 vs. limit=22.5 2023-09-29 17:29:26,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 17:29:29,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 17:29:32,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:33,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=437366.6666666667, ans=0.1 2023-09-29 17:29:34,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 17:29:34,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 17:29:43,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 17:29:47,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 17:29:48,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:29:49,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:29:52,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 17:29:52,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:54,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:29:54,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:29:58,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:30:00,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=437500.0, ans=0.1 2023-09-29 17:30:01,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:05,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:30:05,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:30:05,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:30:05,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:08,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:10,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:30:13,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 17:30:13,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:16,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:30:17,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:30:17,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 17:30:17,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 17:30:19,374 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 17:30:21,373 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-09-29 17:30:24,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:30:24,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:30:24,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:24,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:25,750 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 17:30:25,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:30:27,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:27,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:28,476 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.931e+02 2.222e+02 2.775e+02 3.962e+02, threshold=4.445e+02, percent-clipped=0.0 2023-09-29 17:30:28,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:30:28,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:30:30,314 INFO [train.py:1039] (0/4) Epoch 13, batch 1900, loss[loss=0.2025, simple_loss=0.2618, pruned_loss=0.07163, over 23837.00 frames. 
], tot_loss[loss=0.194, simple_loss=0.2656, pruned_loss=0.06125, over 4704246.56 frames. ], batch size: 195, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:30:30,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:30:30,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 17:30:33,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:33,577 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 17:30:33,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:30:35,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:40,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:42,426 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=437633.3333333333, ans=0.0 2023-09-29 17:30:43,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:30:43,727 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 17:30:45,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. 
Duration: 0.67 2023-09-29 17:30:45,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:46,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:46,805 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 17:30:48,229 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 17:30:51,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 17:30:52,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:30:56,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 17:30:57,454 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.92 vs. limit=22.5 2023-09-29 17:31:00,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 17:31:09,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=437766.6666666667, ans=0.0 2023-09-29 17:31:11,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 17:31:13,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-09-29 17:31:14,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:31:14,920 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 17:31:14,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 17:31:14,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 17:31:16,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 17:31:16,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:31:21,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 17:31:25,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:31:27,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:27,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 17:31:28,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 17:31:29,025 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=437833.3333333333, ans=0.1 2023-09-29 17:31:34,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 17:31:36,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:41,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:31:41,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:31:41,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:31:42,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:31:46,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:31:46,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:31:46,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:31:49,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:49,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 17:31:52,144 INFO [train.py:1039] (0/4) Epoch 13, batch 1950, loss[loss=0.1966, simple_loss=0.2786, pruned_loss=0.0573, over 24307.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2668, pruned_loss=0.06167, over 4714942.19 frames. ], batch size: 74, lr: 8.02e-03, grad_scale: 8.0 2023-09-29 17:31:52,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:31:52,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:52,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:53,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:55,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:00,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:32:00,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:00,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:32:01,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 17:32:04,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-29 17:32:04,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:06,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:08,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:32:10,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:10,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:12,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:16,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:16,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:32:17,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:32:17,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:20,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:24,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 17:32:24,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:24,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:32:24,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 17:32:24,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:32:24,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:32:25,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:26,439 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.36 vs. limit=15.0 2023-09-29 17:32:29,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:30,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:32:35,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:32:38,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 17:32:38,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:32:40,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 17:32:40,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:32:40,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=438166.6666666667, ans=0.0 2023-09-29 17:32:44,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=438166.6666666667, ans=0.125 2023-09-29 17:32:45,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:45,879 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:32:47,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:32:48,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:32:56,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:57,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:59,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:33:00,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:01,007 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:33:03,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:33:03,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:05,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 17:33:05,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:33:06,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:33:08,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 17:33:11,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:14,898 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.174e+02 2.503e+02 4.017e+02, threshold=4.347e+02, percent-clipped=0.0 2023-09-29 17:33:14,940 INFO [train.py:1039] (0/4) Epoch 13, batch 2000, loss[loss=0.2011, simple_loss=0.2724, pruned_loss=0.06494, over 24463.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2681, pruned_loss=0.06269, over 4711010.30 frames. 
], batch size: 63, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:33:15,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:33:17,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:33:17,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:33:18,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:33:21,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:24,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 17:33:25,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:33:28,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:33:31,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 17:33:33,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:33:33,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 17:33:36,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:33:36,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=438366.6666666667, ans=0.0 2023-09-29 17:33:37,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 17:33:40,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:44,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 17:33:45,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:33:46,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 17:33:46,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:33:50,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:33:52,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. 
Duration: 0.636375 2023-09-29 17:33:52,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:54,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:33:54,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:33:56,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 17:33:59,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 17:34:00,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:34:00,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:04,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=438500.0, ans=0.0 2023-09-29 17:34:06,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:07,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:34:07,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:07,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 17:34:09,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:10,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:10,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:10,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:12,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:15,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:34:15,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 17:34:15,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=438500.0, ans=0.125 2023-09-29 17:34:21,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:34:23,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:34:25,859 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=438566.6666666667, ans=0.0 2023-09-29 17:34:26,379 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.73 vs. limit=15.0 2023-09-29 17:34:27,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:34:31,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=438566.6666666667, ans=0.125 2023-09-29 17:34:33,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:35,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:35,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:36,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:34:36,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:34:37,997 INFO [train.py:1039] (0/4) Epoch 13, batch 2050, loss[loss=0.1712, simple_loss=0.2534, pruned_loss=0.0445, over 24313.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2666, pruned_loss=0.06135, over 4719057.56 frames. 
], batch size: 61, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:34:38,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:39,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:42,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:44,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:49,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:52,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:34:52,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:53,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:34:54,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 17:34:54,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:34:55,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 17:34:55,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:34:58,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=438700.0, ans=0.2 2023-09-29 17:35:03,077 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=438700.0, ans=0.125 2023-09-29 17:35:09,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:09,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:11,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 17:35:13,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:14,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 17:35:14,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:19,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:20,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:22,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 17:35:22,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:24,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:35:25,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:35:25,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:35:30,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:32,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:35:35,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:35:35,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:35:39,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:35:44,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:35:44,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=438900.0, ans=0.0 2023-09-29 17:35:45,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-09-29 17:35:50,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:35:52,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:35:52,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=438900.0, ans=0.2 2023-09-29 17:35:53,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:35:55,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 17:35:59,775 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.857e+02 2.010e+02 2.339e+02 3.458e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-29 17:35:59,820 INFO [train.py:1039] (0/4) Epoch 13, batch 2100, loss[loss=0.1937, simple_loss=0.279, pruned_loss=0.05418, over 24425.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2655, pruned_loss=0.06094, over 4722460.04 frames. ], batch size: 69, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:35:59,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 17:35:59,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:01,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:01,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 17:36:03,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:36:03,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 17:36:03,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 17:36:05,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:36:08,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:36:09,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:36:11,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:11,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:36:11,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 17:36:11,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=438966.6666666667, ans=0.125 2023-09-29 17:36:14,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:36:14,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-09-29 17:36:14,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 17:36:15,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:17,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:36:17,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 17:36:17,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 17:36:23,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 17:36:23,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:27,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:36:27,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:30,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:36:31,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 17:36:32,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:36:32,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:36:32,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=439100.0, ans=0.125 2023-09-29 17:36:34,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 17:36:35,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:35,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 17:36:37,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 17:36:37,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 17:36:39,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:36:42,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:36:43,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:44,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:47,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:36:49,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:49,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 17:36:49,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:51,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:52,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:52,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 17:36:53,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 17:36:54,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 17:36:54,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=439166.6666666667, ans=0.2 2023-09-29 17:36:58,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:37:01,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:37:03,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-09-29 17:37:07,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:09,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:37:09,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:09,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:09,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:37:09,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:37:13,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:13,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:37:14,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:37:14,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:16,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 17:37:18,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. 
Duration: 0.67 2023-09-29 17:37:18,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:21,812 INFO [train.py:1039] (0/4) Epoch 13, batch 2150, loss[loss=0.1916, simple_loss=0.2649, pruned_loss=0.05917, over 23703.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2655, pruned_loss=0.06083, over 4722369.98 frames. ], batch size: 149, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:37:22,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:37:22,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:37:23,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:37:23,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:37:30,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:37:31,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:33,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:34,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:37:34,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:37:36,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:37:39,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:40,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:37:40,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:37:42,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:42,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 17:37:48,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:37:50,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:37:52,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:52,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:53,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 17:37:53,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:53,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:55,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:56,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 17:37:58,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:37:59,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:59,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:00,673 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.06 vs. limit=22.5 2023-09-29 17:38:01,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:38:01,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:38:02,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.91 vs. 
limit=15.0 2023-09-29 17:38:05,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:06,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:38:06,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:06,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 17:38:06,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:38:11,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:12,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:13,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:13,228 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=439500.0, ans=0.125 2023-09-29 17:38:14,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:38:14,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:16,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:38:16,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 17:38:18,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 17:38:19,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:38:20,000 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 17:38:21,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:21,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:38:23,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 17:38:23,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:38:23,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 17:38:23,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 17:38:23,602 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 17:38:23,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 17:38:25,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:38:26,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:38:26,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:38:28,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:38:31,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:31,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:38,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=439566.6666666667, ans=0.0 2023-09-29 17:38:38,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=439566.6666666667, ans=0.0 2023-09-29 17:38:39,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:38:40,481 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.71 vs. limit=15.0 2023-09-29 17:38:41,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. 
Duration: 0.83 2023-09-29 17:38:44,263 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.873e+02 2.053e+02 2.392e+02 4.399e+02, threshold=4.106e+02, percent-clipped=1.0 2023-09-29 17:38:44,306 INFO [train.py:1039] (0/4) Epoch 13, batch 2200, loss[loss=0.2052, simple_loss=0.2702, pruned_loss=0.07009, over 23669.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2655, pruned_loss=0.06007, over 4734171.22 frames. ], batch size: 232, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:38:44,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:38:49,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:50,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:38:52,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:52,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:38:57,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:57,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:57,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 17:39:03,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-09-29 17:39:03,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:39:08,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 17:39:11,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:12,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:13,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:39:18,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:39:19,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 17:39:20,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=439766.6666666667, ans=0.125 2023-09-29 17:39:23,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:39:26,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:28,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. 
Duration: 0.4 2023-09-29 17:39:29,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=439766.6666666667, ans=0.1 2023-09-29 17:39:31,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:39:33,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:35,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:39:37,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:38,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 17:39:40,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:41,349 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.54 vs. limit=15.0 2023-09-29 17:39:41,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 17:39:44,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:44,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 17:39:44,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:48,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:48,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:48,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:39:48,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=439900.0, ans=0.2 2023-09-29 17:39:49,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:39:51,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:39:53,679 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.38 vs. limit=15.0 2023-09-29 17:39:55,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-29 17:39:55,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:39:55,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=439900.0, ans=0.1 2023-09-29 17:39:57,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=439900.0, ans=0.125 2023-09-29 17:39:58,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:39:58,406 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 17:40:01,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:40:01,906 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 17:40:02,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:40:03,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 17:40:05,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:05,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:40:07,067 INFO [train.py:1039] (0/4) Epoch 13, batch 2250, loss[loss=0.1806, simple_loss=0.2552, pruned_loss=0.053, over 24575.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2663, pruned_loss=0.06009, over 4735116.57 frames. 
], batch size: 60, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:40:08,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:08,774 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 17:40:09,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=439966.6666666667, ans=0.0 2023-09-29 17:40:11,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:40:13,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:19,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:40:21,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:40:25,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:27,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:27,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:28,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 17:40:30,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:40:30,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:40:33,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 17:40:34,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:40:34,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:34,708 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.75 vs. limit=22.5 2023-09-29 17:40:37,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:43,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:40:45,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:40:45,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:40:46,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 17:40:48,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:49,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 17:40:55,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:57,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:58,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:58,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:41:00,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:41:02,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:41:05,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=440166.6666666667, ans=0.0 2023-09-29 17:41:06,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:41:07,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:41:11,754 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=440233.3333333333, ans=0.1 2023-09-29 17:41:15,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 17:41:16,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:41:18,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:41:23,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:41:23,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=440233.3333333333, ans=0.0 2023-09-29 17:41:26,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:41:26,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 17:41:26,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:26,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:41:27,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 17:41:29,317 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.929e+02 2.161e+02 2.428e+02 3.244e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-29 17:41:29,361 INFO [train.py:1039] (0/4) Epoch 13, batch 2300, loss[loss=0.1825, simple_loss=0.2589, pruned_loss=0.05303, over 24309.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.267, pruned_loss=0.06093, over 4738557.15 frames. 
], batch size: 61, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:41:32,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:41:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:38,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:38,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:41:42,398 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 17:41:43,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:52,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:41:52,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:41:54,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:41:54,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:54,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 17:41:55,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 17:41:58,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:41:59,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:41:59,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=440366.6666666667, ans=0.125 2023-09-29 17:42:02,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:42:06,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:42:08,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:10,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=440433.3333333333, ans=0.125 2023-09-29 17:42:10,520 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.48 vs. limit=15.0 2023-09-29 17:42:13,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:42:13,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:42:16,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 17:42:19,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:42:23,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:42:23,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:42:25,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:42:25,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 17:42:30,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:42:30,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:31,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:31,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:42:33,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:34,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 17:42:34,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 17:42:35,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 17:42:35,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:42:35,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:35,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 17:42:41,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:42:44,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:42:49,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:49,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:42:51,360 INFO [train.py:1039] (0/4) Epoch 13, batch 2350, loss[loss=0.1979, simple_loss=0.2616, pruned_loss=0.06712, over 23202.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2677, pruned_loss=0.06151, over 4738184.62 frames. ], batch size: 105, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:42:51,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:42:51,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 17:42:51,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:42:53,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:42:53,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 17:43:00,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:00,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 17:43:01,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=440633.3333333333, ans=0.125 2023-09-29 17:43:07,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 17:43:10,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:43:13,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:13,722 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=440700.0, ans=0.125 2023-09-29 17:43:14,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 17:43:15,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:15,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 17:43:18,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:43:25,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 17:43:26,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:30,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:43:30,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:43:33,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:43:35,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 17:43:35,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:43:39,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:39,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 17:43:39,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:43:42,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:43:42,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=440833.3333333333, ans=0.125 2023-09-29 17:43:43,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 17:43:45,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:46,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:46,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:43:49,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 17:43:49,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:43:54,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 17:43:54,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:43:59,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-09-29 17:44:04,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 17:44:04,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:44:04,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:44:05,798 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 17:44:05,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 17:44:07,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 17:44:10,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:44:12,061 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=440900.0, ans=0.0 2023-09-29 17:44:14,654 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.807e+02 2.059e+02 2.357e+02 3.650e+02, threshold=4.118e+02, percent-clipped=0.0 2023-09-29 17:44:14,697 INFO [train.py:1039] (0/4) Epoch 13, batch 2400, loss[loss=0.1883, simple_loss=0.2433, pruned_loss=0.0666, over 23586.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2687, pruned_loss=0.06221, over 4731833.73 frames. ], batch size: 256, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:44:14,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 17:44:17,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:44:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:44:21,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 17:44:21,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 17:44:24,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=440966.6666666667, ans=0.0 2023-09-29 17:44:30,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:44:30,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:44:32,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 17:44:32,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:44:32,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:33,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 17:44:40,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:44:42,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 17:44:47,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:44:50,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 17:44:50,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=441100.0, ans=0.125 2023-09-29 17:44:53,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:44:55,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:59,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:44:59,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 17:44:59,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:45:08,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:12,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:14,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 17:45:15,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=441166.6666666667, ans=0.1 2023-09-29 17:45:16,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:45:17,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:45:17,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:45:17,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:17,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:19,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:45:23,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:45:24,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:45:24,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 17:45:25,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 17:45:26,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:45:26,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:26,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 17:45:28,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 17:45:29,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 17:45:29,837 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 17:45:31,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 17:45:32,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:45:34,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:34,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:35,887 INFO [train.py:1039] (0/4) Epoch 13, batch 2450, loss[loss=0.18, simple_loss=0.2566, pruned_loss=0.05169, over 22055.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2667, pruned_loss=0.06168, over 4720189.35 frames. ], batch size: 48, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:45:36,009 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 17:45:36,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:45:37,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:45:42,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:45:42,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:48,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:48,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:50,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 17:45:53,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=441366.6666666667, ans=0.015 2023-09-29 17:45:56,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:56,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:59,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:45:59,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:45:59,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 17:45:59,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 17:46:04,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:05,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:46:07,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:46:10,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:46:10,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:10,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=441433.3333333333, ans=0.125 2023-09-29 17:46:11,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:12,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:46:13,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 17:46:15,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 17:46:15,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=441433.3333333333, ans=0.125 2023-09-29 17:46:23,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:24,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:24,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:25,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:46:25,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:26,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:46:28,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 17:46:28,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=441500.0, ans=0.0 2023-09-29 17:46:31,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:31,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 17:46:35,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:46:35,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:37,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=441500.0, ans=0.0 2023-09-29 17:46:39,951 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.35 vs. limit=15.0 2023-09-29 17:46:40,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:46:40,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 17:46:40,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=441566.6666666667, ans=0.125 2023-09-29 17:46:42,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:46:43,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:46:45,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 17:46:45,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:46:46,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 17:46:49,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:46:53,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:53,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:46:58,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 17:46:59,952 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.935e+02 2.161e+02 2.595e+02 3.888e+02, threshold=4.322e+02, percent-clipped=0.0 2023-09-29 17:46:59,996 INFO [train.py:1039] (0/4) Epoch 13, batch 2500, loss[loss=0.1882, simple_loss=0.2532, pruned_loss=0.06159, over 23746.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2659, pruned_loss=0.06088, over 4733226.72 frames. ], batch size: 179, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:47:00,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:47:06,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:08,316 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.47 vs. limit=12.0 2023-09-29 17:47:15,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:47:15,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:47:17,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:17,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 17:47:20,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=441700.0, ans=0.125 2023-09-29 17:47:25,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:47:25,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:47:27,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:47:29,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:47:30,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 17:47:30,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:30,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:32,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 17:47:32,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 17:47:32,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 17:47:33,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:38,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:47:39,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:41,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:47:41,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 17:47:41,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:47:44,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:47,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:51,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:53,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=441833.3333333333, ans=22.5 2023-09-29 17:47:54,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 17:47:58,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=441833.3333333333, ans=0.125 2023-09-29 17:48:00,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:48:05,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 17:48:05,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:05,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:06,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:48:06,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:48:08,480 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 17:48:08,481 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 17:48:08,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 17:48:12,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:15,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. 
Duration: 0.97 2023-09-29 17:48:17,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 17:48:17,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:48:18,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 17:48:22,042 INFO [train.py:1039] (0/4) Epoch 13, batch 2550, loss[loss=0.1901, simple_loss=0.2674, pruned_loss=0.05641, over 24474.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2657, pruned_loss=0.06046, over 4747483.30 frames. ], batch size: 66, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:48:22,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 17:48:25,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:26,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:48:26,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:48:28,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:30,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 17:48:30,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 17:48:36,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 17:48:39,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:48:42,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:43,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:43,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 17:48:44,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:48:44,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:48:44,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:47,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:48:47,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 17:48:48,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:48,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 17:48:48,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 17:48:48,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=442033.3333333333, ans=0.125 2023-09-29 17:48:51,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=442033.3333333333, ans=0.125 2023-09-29 17:49:00,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:49:05,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:05,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:05,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:49:07,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:49:11,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=442166.6666666667, ans=0.125 2023-09-29 17:49:12,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 17:49:15,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=442166.6666666667, ans=0.05 2023-09-29 17:49:16,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:49:17,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:49:17,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:49:17,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:49:19,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:49:20,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=442166.6666666667, ans=0.5 2023-09-29 17:49:22,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:23,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:26,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:49:26,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-09-29 17:49:26,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:49:27,122 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=442233.3333333333, ans=0.125 2023-09-29 17:49:28,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:29,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:49:31,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:49:32,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:37,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:49:41,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:44,610 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.935e+02 2.262e+02 2.614e+02 3.523e+02, threshold=4.524e+02, percent-clipped=0.0 2023-09-29 17:49:44,671 INFO [train.py:1039] (0/4) Epoch 13, batch 2600, loss[loss=0.1968, simple_loss=0.2747, pruned_loss=0.05944, over 24391.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2668, pruned_loss=0.06065, over 4751017.72 frames. 
], batch size: 77, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:49:45,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=442300.0, ans=0.125 2023-09-29 17:49:46,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 17:49:49,380 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 17:49:50,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:49:50,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 17:49:50,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 17:49:50,965 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 17:49:54,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:56,101 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 17:49:56,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 17:49:57,225 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.60 vs. limit=12.0 2023-09-29 17:49:57,746 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 17:49:59,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 17:50:02,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 17:50:02,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 17:50:05,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:50:05,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 17:50:08,503 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 17:50:08,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 17:50:17,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:17,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:19,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:19,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 17:50:21,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:50:27,090 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. 
Duration: 0.768 2023-09-29 17:50:32,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:33,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:35,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 17:50:36,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:36,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:36,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 17:50:37,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=442500.0, ans=0.0 2023-09-29 17:50:38,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:50:38,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:50:41,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,860 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 17:50:45,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 17:50:45,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:50:50,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:52,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:50:52,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 17:50:54,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:56,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:50:57,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:03,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 17:51:04,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:07,433 INFO [train.py:1039] (0/4) Epoch 13, batch 2650, loss[loss=0.1648, simple_loss=0.2398, pruned_loss=0.04488, over 24345.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2678, pruned_loss=0.06133, over 4751875.27 frames. ], batch size: 56, lr: 7.98e-03, grad_scale: 16.0 2023-09-29 17:51:07,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 17:51:10,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 17:51:10,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:11,169 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=442633.3333333333, ans=0.0 2023-09-29 17:51:12,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:51:12,558 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 17:51:12,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:15,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:18,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:51:20,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:24,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:51:25,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 17:51:25,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 17:51:25,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:51:27,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 17:51:29,823 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 17:51:32,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:51:35,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 17:51:35,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:51:37,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 17:51:42,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:51:42,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:47,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. 
Duration: 0.44 2023-09-29 17:51:47,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 17:51:50,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:51:51,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=442766.6666666667, ans=0.05 2023-09-29 17:51:52,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=442766.6666666667, ans=0.035 2023-09-29 17:51:55,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 17:51:55,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:55,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:57,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:51:57,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:59,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:00,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:52:02,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=442833.3333333333, ans=0.1 2023-09-29 17:52:03,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:04,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:52:06,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:52:07,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:52:07,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:08,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=442833.3333333333, ans=0.05 2023-09-29 17:52:09,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:52:11,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:11,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:12,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 17:52:15,933 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.91 vs. limit=6.0 2023-09-29 17:52:16,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:16,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:52:16,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:18,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 17:52:19,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:22,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:26,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:52:26,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 17:52:29,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:52:29,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 17:52:31,154 INFO [train.py:1039] (0/4) Epoch 13, batch 2700, loss[loss=0.2112, simple_loss=0.2686, pruned_loss=0.0769, over 23630.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2688, pruned_loss=0.06134, over 4747451.74 frames. ], batch size: 256, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:52:32,537 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.954e+02 2.253e+02 2.566e+02 4.959e+02, threshold=4.505e+02, percent-clipped=1.0 2023-09-29 17:52:32,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:52:36,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 17:52:38,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:52:38,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:52:39,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 17:52:39,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:52:40,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:52:40,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 17:52:41,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:52:41,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:52:43,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:52:44,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:48,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:52:50,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 17:52:50,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:52:56,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:52:56,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 17:52:56,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=443033.3333333333, ans=0.2 2023-09-29 17:53:01,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:53:01,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:53:01,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:53:01,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:53:03,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:07,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:07,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:53:09,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:53:12,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:12,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 17:53:21,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=443166.6666666667, ans=0.1 2023-09-29 17:53:22,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=443166.6666666667, ans=0.1 2023-09-29 17:53:23,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:53:23,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:53:27,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:53:27,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:31,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:33,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:33,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:34,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:53:35,080 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=443166.6666666667, ans=0.1 2023-09-29 17:53:36,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:37,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:53:39,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:53:42,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:42,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:46,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 17:53:46,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:48,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:53:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 17:53:49,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 17:53:51,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:53:53,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:53:53,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:54,524 INFO [train.py:1039] (0/4) Epoch 13, batch 2750, loss[loss=0.165, simple_loss=0.2425, pruned_loss=0.0437, over 24616.00 frames. ], tot_loss[loss=0.1953, simple_loss=0.2682, pruned_loss=0.06125, over 4742891.71 frames. ], batch size: 60, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:53:58,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:58,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:53:58,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:01,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:54:01,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:54:01,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-09-29 17:54:01,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:54:02,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:54:08,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 17:54:10,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:54:12,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:13,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:14,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:54:15,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:54:16,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:54:17,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:18,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:54:18,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=443366.6666666667, ans=0.125 2023-09-29 17:54:20,180 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=443366.6666666667, ans=0.0 2023-09-29 17:54:21,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:54:21,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:54:23,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:54:23,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:26,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:54:35,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:37,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:54:37,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:38,155 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.30 vs. 
limit=15.0 2023-09-29 17:54:42,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:42,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:54:43,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:54:45,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=443500.0, ans=0.1 2023-09-29 17:54:51,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:54:51,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:51,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 17:54:57,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:57,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=443566.6666666667, ans=0.125 2023-09-29 17:54:58,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 17:55:00,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=443566.6666666667, ans=0.125 2023-09-29 17:55:04,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 17:55:06,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:55:06,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 17:55:07,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:09,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:55:09,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 17:55:11,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:55:14,467 INFO [train.py:1039] (0/4) Epoch 13, batch 2800, loss[loss=0.1966, simple_loss=0.2404, pruned_loss=0.07642, over 19198.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2669, pruned_loss=0.0608, over 4736307.20 frames. ], batch size: 388, lr: 7.97e-03, grad_scale: 32.0 2023-09-29 17:55:14,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 17:55:15,771 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.948e+02 2.222e+02 2.625e+02 4.530e+02, threshold=4.443e+02, percent-clipped=1.0 2023-09-29 17:55:15,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:16,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 17:55:17,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 17:55:17,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:17,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:21,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:21,109 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 17:55:21,110 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 17:55:23,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:26,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:55:26,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:55:30,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:55:33,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 17:55:35,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. 
Duration: 0.488875 2023-09-29 17:55:36,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 17:55:36,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:38,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:55:38,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:55:41,386 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=443700.0, ans=0.125 2023-09-29 17:55:43,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:55:43,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:43,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:55:43,958 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.34 vs. limit=15.0 2023-09-29 17:55:44,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:55:47,010 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.42 vs. 
limit=15.0 2023-09-29 17:55:52,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:55:56,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:56,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=443766.6666666667, ans=0.125 2023-09-29 17:55:58,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:59,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:59,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:02,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=443766.6666666667, ans=0.125 2023-09-29 17:56:05,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:06,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 17:56:06,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:07,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 17:56:07,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:56:11,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:12,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:17,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:19,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:56:19,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:19,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:56:19,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=443900.0, ans=0.0 2023-09-29 17:56:19,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=443900.0, ans=0.125 2023-09-29 17:56:20,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:56:20,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:56:22,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 17:56:22,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 17:56:22,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:24,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:56:24,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:25,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 17:56:27,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:27,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:56:29,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:56:30,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 17:56:34,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=443900.0, ans=0.125 2023-09-29 17:56:36,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=443966.6666666667, ans=0.1 2023-09-29 17:56:37,556 INFO [train.py:1039] (0/4) Epoch 13, batch 2850, loss[loss=0.2162, simple_loss=0.276, pruned_loss=0.07822, over 23776.00 frames. 
], tot_loss[loss=0.1935, simple_loss=0.2657, pruned_loss=0.06062, over 4725465.46 frames. ], batch size: 164, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:56:37,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:37,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:56:39,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:56:41,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:56:44,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:56:45,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:56:46,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:50,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:50,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:51,311 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.46 vs. limit=15.0 2023-09-29 17:56:52,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 17:56:52,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 17:56:56,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=444033.3333333333, ans=0.0 2023-09-29 17:57:00,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 17:57:00,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:02,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 17:57:02,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:02,783 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=444033.3333333333, ans=0.0 2023-09-29 17:57:04,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 17:57:05,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 17:57:05,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=444033.3333333333, ans=0.0 2023-09-29 17:57:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:20,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 17:57:22,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:22,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:57:23,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:57:23,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:57:23,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:57:25,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:57:25,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 17:57:27,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:57:27,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:57:29,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:30,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:57:30,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=444166.6666666667, ans=0.05 2023-09-29 17:57:33,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:33,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:35,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=444166.6666666667, ans=0.2 2023-09-29 17:57:37,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:38,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:40,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:57:40,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:42,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:45,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:57:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 17:57:50,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 17:57:50,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 17:57:50,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=444233.3333333333, ans=0.125 2023-09-29 17:57:52,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:57:53,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:53,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 17:57:55,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:57:55,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:55,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:57:55,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:57:55,180 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. 
Duration: 0.896 2023-09-29 17:57:55,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=444233.3333333333, ans=0.0 2023-09-29 17:57:56,638 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 17:57:56,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:57:56,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:59,674 INFO [train.py:1039] (0/4) Epoch 13, batch 2900, loss[loss=0.1866, simple_loss=0.2626, pruned_loss=0.05534, over 23625.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2655, pruned_loss=0.06068, over 4704648.22 frames. ], batch size: 149, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:58:01,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:02,695 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.932e+02 2.253e+02 2.547e+02 3.848e+02, threshold=4.506e+02, percent-clipped=0.0 2023-09-29 17:58:03,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:03,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:04,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 17:58:10,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:58:10,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 17:58:11,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 17:58:13,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:58:13,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:58:16,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:16,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:58:20,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:58:21,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:24,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:58:24,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 17:58:24,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:58:26,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 17:58:29,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 17:58:30,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 17:58:31,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=444433.3333333333, ans=0.125 2023-09-29 17:58:33,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:58:33,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 17:58:34,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:58:37,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:58:37,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:42,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:42,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:44,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=444433.3333333333, ans=0.125 2023-09-29 17:58:45,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 17:58:49,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:51,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=444500.0, ans=0.125 2023-09-29 17:58:53,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 17:58:53,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 17:58:53,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:58:56,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:58:59,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 17:58:59,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:59:05,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:59:07,904 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.13 vs. limit=15.0 2023-09-29 17:59:13,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:59:13,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 17:59:15,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 17:59:18,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:18,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 17:59:20,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:22,149 INFO [train.py:1039] (0/4) Epoch 13, batch 2950, loss[loss=0.1914, simple_loss=0.2732, pruned_loss=0.05475, over 24440.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2664, pruned_loss=0.06108, over 4702767.27 frames. ], batch size: 69, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:59:22,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:59:25,330 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.41 vs. limit=5.0 2023-09-29 17:59:26,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=444633.3333333333, ans=0.2 2023-09-29 17:59:29,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:30,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 17:59:31,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 17:59:31,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:34,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:59:34,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:59:35,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 17:59:37,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 17:59:37,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:59:37,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:42,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:59:43,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:59:45,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:59:47,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:59:49,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 17:59:49,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:59:50,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:59:54,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 18:00:01,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 18:00:01,552 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 18:00:02,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:00:05,789 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 18:00:05,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 18:00:07,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:00:07,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 18:00:07,581 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 18:00:07,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:00:10,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 18:00:12,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:00:12,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:00:16,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:18,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:00:18,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:19,632 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 18:00:19,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:19,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 18:00:26,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:00:27,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:00:28,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 18:00:29,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:00:31,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 18:00:35,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:36,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=444900.0, ans=0.125 2023-09-29 18:00:37,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:00:38,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:00:38,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:38,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:00:40,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:00:40,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:00:40,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:00:42,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:00:42,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:43,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:00:45,226 INFO [train.py:1039] (0/4) Epoch 13, batch 3000, loss[loss=0.1603, simple_loss=0.2325, pruned_loss=0.04407, over 24448.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2668, pruned_loss=0.06113, over 4710859.19 frames. ], batch size: 58, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 18:00:45,227 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 18:01:00,595 INFO [train.py:1071] (0/4) Epoch 13, validation: loss=0.3476, simple_loss=0.2869, pruned_loss=0.2041, over 1125622.00 frames. 2023-09-29 18:01:00,595 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-29 18:01:00,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:00,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 18:01:02,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:01:04,370 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.886e+02 2.154e+02 2.482e+02 3.380e+02, threshold=4.309e+02, percent-clipped=0.0 2023-09-29 18:01:04,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:06,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:01:09,666 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 18:01:09,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 18:01:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:01:11,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:01:12,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 18:01:14,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:21,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:01:30,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:01:40,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. 
Duration: 0.72 2023-09-29 18:01:41,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:01:44,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:01:45,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:45,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:01:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:01:47,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 18:01:49,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 18:01:50,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:01:50,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:01:53,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:01:53,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:01:55,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:01:55,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:01:59,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:02:01,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:02:01,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:02:02,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:02:03,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=445166.6666666667, ans=0.125 2023-09-29 18:02:04,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 18:02:05,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:02:05,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:07,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:02:09,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:09,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:02:11,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:02:11,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 18:02:12,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:02:12,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 18:02:12,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:02:16,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 18:02:19,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:02:19,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=445233.3333333333, ans=0.125 2023-09-29 18:02:20,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:02:21,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 18:02:22,402 INFO [train.py:1039] (0/4) Epoch 13, batch 3050, loss[loss=0.1659, simple_loss=0.2436, pruned_loss=0.04408, over 21077.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2668, pruned_loss=0.0613, over 4711078.42 frames. 
], batch size: 46, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:02:23,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 18:02:23,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:02:24,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:02:25,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:25,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:02:26,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:27,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:02:29,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 18:02:30,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:02:33,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:33,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 18:02:38,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:39,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 18:02:45,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 18:02:45,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 18:02:47,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:02:47,427 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=445366.6666666667, ans=0.0 2023-09-29 18:02:48,162 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.11 vs. limit=15.0 2023-09-29 18:02:52,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:02:55,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:55,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:56,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 18:02:57,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=445433.3333333333, ans=0.1 2023-09-29 18:02:59,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:01,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:03:01,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:01,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=445433.3333333333, ans=0.1 2023-09-29 18:03:02,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:03:02,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:02,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:06,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:09,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:09,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. 
Duration: 0.96 2023-09-29 18:03:09,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:09,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:03:12,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:03:13,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:03:14,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:14,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:17,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=445500.0, ans=0.2 2023-09-29 18:03:19,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:20,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:25,276 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.44 vs. limit=15.0 2023-09-29 18:03:26,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 18:03:28,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:03:28,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:28,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:29,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:03:31,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:31,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 18:03:32,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:34,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:34,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 18:03:36,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:42,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 18:03:42,822 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=14.55 vs. limit=15.0 2023-09-29 18:03:43,545 INFO [train.py:1039] (0/4) Epoch 13, batch 3100, loss[loss=0.2258, simple_loss=0.2832, pruned_loss=0.08421, over 23752.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2659, pruned_loss=0.06076, over 4711613.11 frames. ], batch size: 164, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:03:43,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:03:45,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:03:46,662 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.826e+02 2.024e+02 2.314e+02 3.606e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-29 18:03:46,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 18:03:50,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 18:03:51,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 18:03:52,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:03:52,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=445633.3333333333, ans=0.125 2023-09-29 18:03:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:03:57,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:59,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:04:02,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:07,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 18:04:09,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=445700.0, ans=0.035 2023-09-29 18:04:13,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:04:13,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:14,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:14,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:04:15,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:04:15,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=445766.6666666667, ans=0.2 2023-09-29 18:04:16,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:04:16,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 18:04:16,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:04:18,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=445766.6666666667, ans=0.125 2023-09-29 18:04:20,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:20,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 18:04:21,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:04:22,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=445766.6666666667, ans=0.125 2023-09-29 18:04:25,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:04:27,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 18:04:29,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 18:04:29,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:30,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 18:04:33,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:33,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:33,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:04:35,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:04:35,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:04:37,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:04:37,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:04:37,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:37,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. 
Duration: 0.3909375 2023-09-29 18:04:37,343 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=445833.3333333333, ans=0.125 2023-09-29 18:04:39,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=445833.3333333333, ans=0.0 2023-09-29 18:04:39,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=445833.3333333333, ans=0.95 2023-09-29 18:04:41,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:41,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 18:04:43,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:04:45,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 18:04:45,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:45,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:46,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 18:04:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 18:05:03,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:05:04,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:06,264 INFO [train.py:1039] (0/4) Epoch 13, batch 3150, loss[loss=0.1807, simple_loss=0.2616, pruned_loss=0.04995, over 24633.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2645, pruned_loss=0.06053, over 4698489.86 frames. ], batch size: 65, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:05:06,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:05:06,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:05:08,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 18:05:09,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:09,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:05:11,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 18:05:14,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:15,735 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 18:05:18,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. 
Duration: 0.53 2023-09-29 18:05:18,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:05:20,458 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 18:05:23,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:05:24,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 18:05:25,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 18:05:25,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 18:05:25,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:25,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:27,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:28,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 18:05:30,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:05:30,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:34,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:05:39,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 18:05:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:05:41,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:05:43,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:44,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 18:05:45,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=446100.0, ans=0.125 2023-09-29 18:05:47,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 18:05:48,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:05:49,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:05:49,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-29 18:05:49,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:49,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:05:51,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:05:51,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:05:52,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 18:05:52,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:05:52,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:05:55,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:05:55,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:57,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 18:05:57,291 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=446166.6666666667, ans=0.125 2023-09-29 18:05:58,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 18:06:00,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 18:06:00,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:02,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 18:06:03,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 18:06:04,245 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=446166.6666666667, ans=0.1 2023-09-29 18:06:05,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:06:05,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:05,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 18:06:07,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 18:06:09,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:06:09,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=446166.6666666667, ans=0.2 2023-09-29 18:06:12,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 18:06:14,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:14,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:06:18,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:06:19,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:21,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 18:06:24,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:06:24,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 18:06:29,121 INFO [train.py:1039] (0/4) Epoch 13, batch 3200, loss[loss=0.1958, simple_loss=0.2629, pruned_loss=0.06436, over 23700.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2635, pruned_loss=0.0605, over 4710273.86 frames. ], batch size: 212, lr: 7.95e-03, grad_scale: 32.0 2023-09-29 18:06:29,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:06:29,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=446300.0, ans=0.125 2023-09-29 18:06:29,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=446300.0, ans=0.125 2023-09-29 18:06:30,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:06:30,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 18:06:32,602 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.906e+02 2.221e+02 2.638e+02 3.823e+02, threshold=4.442e+02, percent-clipped=0.0 2023-09-29 18:06:34,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:39,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:06:44,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:51,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=446366.6666666667, ans=0.0 2023-09-29 18:06:55,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:07:05,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. 
Duration: 0.88 2023-09-29 18:07:05,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:07:08,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 18:07:10,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:07:14,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:07:14,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:07:15,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:07:20,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 18:07:20,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:07:23,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 18:07:26,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 18:07:29,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:07:36,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:07:36,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:07:36,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:37,851 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.86 vs. limit=15.0 2023-09-29 18:07:38,308 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 18:07:38,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:07:41,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:07:43,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 18:07:43,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 18:07:45,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 18:07:47,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 18:07:49,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:07:52,558 INFO [train.py:1039] (0/4) Epoch 13, batch 3250, loss[loss=0.1996, simple_loss=0.2648, pruned_loss=0.06718, over 23278.00 frames. 
], tot_loss[loss=0.1931, simple_loss=0.265, pruned_loss=0.06056, over 4715196.08 frames. ], batch size: 119, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:07:52,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:07:52,700 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 18:07:52,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:07:52,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:07:54,244 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 18:07:59,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:07:59,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=446633.3333333333, ans=0.125 2023-09-29 18:08:02,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:09,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:08:09,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 18:08:10,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 18:08:12,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:08:12,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:13,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:14,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:08:17,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:08:17,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:17,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:19,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:08:21,384 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.96 vs. 
limit=6.0 2023-09-29 18:08:21,761 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.89 vs. limit=15.0 2023-09-29 18:08:22,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:24,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:25,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=446766.6666666667, ans=0.0 2023-09-29 18:08:26,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:26,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:28,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:28,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:28,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:08:33,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 18:08:34,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:34,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 18:08:36,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:36,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:08:44,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:08:48,196 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=446833.3333333333, ans=0.1 2023-09-29 18:08:53,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:08:54,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:54,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 18:08:54,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:08:54,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:08:54,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:59,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 18:08:59,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. 
Duration: 0.94 2023-09-29 18:09:00,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:09:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:03,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:03,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:09:03,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:06,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:06,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:08,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 18:09:08,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:11,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:09:11,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 18:09:14,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 18:09:14,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 18:09:14,936 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=446966.6666666667, ans=0.0 2023-09-29 18:09:16,069 INFO [train.py:1039] (0/4) Epoch 13, batch 3300, loss[loss=0.1551, simple_loss=0.2281, pruned_loss=0.04105, over 18456.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2653, pruned_loss=0.0602, over 4713678.18 frames. ], batch size: 40, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:09:16,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 18:09:18,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 18:09:18,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:21,251 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.922e+02 2.153e+02 2.771e+02 4.428e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 18:09:22,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:24,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:09:24,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:26,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-29 18:09:27,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:09:29,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:31,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:35,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 18:09:37,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:09:37,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:40,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:40,306 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 18:09:41,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:09:43,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:09:43,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:09:43,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:09:43,334 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 18:09:47,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:49,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:09:52,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:52,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 18:09:54,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:09:54,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:55,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:09:57,508 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 18:09:59,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 18:09:59,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 18:09:59,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=447100.0, ans=0.125 2023-09-29 18:10:02,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 18:10:05,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:08,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:10:09,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:11,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:12,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:12,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:10:12,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:10:15,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:10:16,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:17,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 18:10:18,965 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 18:10:20,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 18:10:22,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:10:23,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:10:23,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:25,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:10:25,585 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=447233.3333333333, ans=0.125 2023-09-29 18:10:27,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:27,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:10:28,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 18:10:31,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:10:34,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 18:10:34,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:36,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:36,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:10:36,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:36,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=447300.0, ans=0.2 2023-09-29 18:10:38,577 INFO [train.py:1039] (0/4) Epoch 13, batch 3350, loss[loss=0.187, simple_loss=0.2591, pruned_loss=0.05744, over 20556.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2661, pruned_loss=0.0609, over 4721753.10 frames. ], batch size: 44, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:10:38,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:41,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:41,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:10:46,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:50,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:51,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:10:53,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:55,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:10:56,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:58,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:10:59,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 18:11:01,169 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 18:11:02,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:11:04,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 18:11:05,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. 
Duration: 0.73 2023-09-29 18:11:06,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:11:06,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:11:07,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:08,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 18:11:08,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:09,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:11:11,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:12,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:14,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:14,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:11:18,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:21,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:11:21,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:23,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=447433.3333333333, ans=0.125 2023-09-29 18:11:23,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=447433.3333333333, ans=0.125 2023-09-29 18:11:26,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:11:28,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:29,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:29,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:31,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:33,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 18:11:33,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:11:33,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. 
Duration: 0.87 2023-09-29 18:11:34,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:11:34,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 18:11:36,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:36,555 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:11:37,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:44,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:46,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 18:11:46,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:11:48,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:11:50,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:11:57,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:11:58,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. 
Duration: 0.88 2023-09-29 18:12:00,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:12:01,650 INFO [train.py:1039] (0/4) Epoch 13, batch 3400, loss[loss=0.217, simple_loss=0.2766, pruned_loss=0.07872, over 22560.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.266, pruned_loss=0.06042, over 4731018.27 frames. ], batch size: 322, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:12:01,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:12:03,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:03,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 18:12:03,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:03,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 18:12:06,367 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.928e+02 2.132e+02 2.448e+02 3.305e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 18:12:06,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:06,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:12:06,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=447633.3333333333, ans=0.125 2023-09-29 18:12:08,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:12:08,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:12:08,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 18:12:13,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 18:12:13,287 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 18:12:13,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:18,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:12:18,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:12:20,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:20,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:12:25,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 18:12:28,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 18:12:34,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:12:37,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 18:12:41,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.03 vs. limit=22.5 2023-09-29 18:12:43,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:12:45,561 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.33 vs. limit=22.5 2023-09-29 18:12:48,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=447766.6666666667, ans=0.07 2023-09-29 18:12:49,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 18:12:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:12:54,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:56,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 18:12:56,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:56,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:58,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:58,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=447833.3333333333, ans=0.0 2023-09-29 18:12:59,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:13:02,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:13:06,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:13:06,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:13:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:14,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. 
Duration: 0.95 2023-09-29 18:13:17,068 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=15.0 2023-09-29 18:13:19,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:13:19,783 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.96 vs. limit=15.0 2023-09-29 18:13:23,691 INFO [train.py:1039] (0/4) Epoch 13, batch 3450, loss[loss=0.1962, simple_loss=0.2802, pruned_loss=0.05614, over 23988.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.266, pruned_loss=0.06092, over 4729654.71 frames. ], batch size: 80, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:13:23,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 18:13:28,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 18:13:28,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:13:30,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:13:30,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 18:13:32,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:37,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 18:13:40,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:13:41,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:13:43,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:13:43,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:45,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:52,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 18:13:58,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 18:13:58,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:14:00,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:14:00,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:09,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 18:14:09,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 18:14:12,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:12,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:14:15,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:14:16,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:14:19,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 18:14:19,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:19,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:14:22,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:14:25,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 18:14:26,937 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.26 vs. limit=6.0 2023-09-29 18:14:28,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 18:14:33,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:14:34,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:35,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=448233.3333333333, ans=0.1 2023-09-29 18:14:35,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=448233.3333333333, ans=0.1 2023-09-29 18:14:38,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:38,954 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=15.0 2023-09-29 18:14:42,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:42,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:44,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:14:44,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:48,033 INFO [train.py:1039] (0/4) Epoch 13, batch 3500, loss[loss=0.2057, simple_loss=0.2714, pruned_loss=0.07003, over 23659.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2652, pruned_loss=0.06053, over 4722555.79 frames. 
], batch size: 149, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:14:49,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:52,616 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.904e+02 2.170e+02 2.519e+02 3.488e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-29 18:14:52,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:14:53,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=448300.0, ans=0.0 2023-09-29 18:14:54,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 18:14:55,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:14:59,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:15:02,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:15:02,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 18:15:08,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:15:09,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:15:09,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:15:09,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:09,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:15:11,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:12,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:12,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 18:15:15,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:15,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:15:17,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:20,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:22,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 18:15:22,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 18:15:25,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:27,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:15:28,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:30,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:15:31,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 18:15:33,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 18:15:34,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 18:15:34,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:36,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=448500.0, ans=0.0 2023-09-29 18:15:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:37,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 18:15:37,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:15:41,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:15:41,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:15:47,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:15:49,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 18:15:49,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 18:15:49,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:15:49,878 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.52 vs. limit=6.0 2023-09-29 18:15:52,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:15:52,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:15:54,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:15:57,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 18:15:57,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:16:00,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:16:01,236 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.56 vs. limit=15.0 2023-09-29 18:16:01,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 18:16:03,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 18:16:05,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:05,610 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:16:06,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:16:06,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:06,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:09,899 INFO [train.py:1039] (0/4) Epoch 13, batch 3550, loss[loss=0.2006, simple_loss=0.2637, pruned_loss=0.06873, over 23708.00 frames. 
], tot_loss[loss=0.1916, simple_loss=0.2633, pruned_loss=0.05996, over 4724211.99 frames. ], batch size: 150, lr: 7.92e-03, grad_scale: 16.0 2023-09-29 18:16:10,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:16:21,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:22,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 18:16:26,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:16:27,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:16:29,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:31,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:16:31,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:16:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:35,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:16:35,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 18:16:35,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:16:37,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:16:43,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:16:43,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:45,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:16:45,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:45,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:16:46,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 18:16:46,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:49,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:51,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:16:57,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:16:58,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:16:58,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=448833.3333333333, ans=0.0 2023-09-29 18:17:00,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:01,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=448833.3333333333, ans=0.0 2023-09-29 18:17:02,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 18:17:02,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:17:02,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 18:17:03,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:17:05,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:17:05,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:17:08,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 18:17:10,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:17:13,504 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=448833.3333333333, ans=0.0 2023-09-29 18:17:16,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:16,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 18:17:18,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:21,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:17:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 18:17:29,580 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.50 vs. limit=15.0 2023-09-29 18:17:30,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 18:17:30,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:17:31,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:17:33,851 INFO [train.py:1039] (0/4) Epoch 13, batch 3600, loss[loss=0.205, simple_loss=0.2759, pruned_loss=0.06703, over 23259.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2633, pruned_loss=0.06009, over 4716798.93 frames. 
], batch size: 119, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:17:35,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:17:39,125 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.817e+02 2.056e+02 2.414e+02 4.361e+02, threshold=4.112e+02, percent-clipped=1.0 2023-09-29 18:17:40,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:43,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:44,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:17:45,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:17:45,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:45,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. 
Duration: 0.89 2023-09-29 18:17:47,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=448966.6666666667, ans=0.125 2023-09-29 18:17:48,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:17:48,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:51,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:17:55,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:17:56,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:17:56,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:58,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 18:17:58,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:18:01,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:18:01,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:18:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:18:07,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:18:07,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:08,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 18:18:16,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:18,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:18:18,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 18:18:23,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:18:26,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=449166.6666666667, ans=0.125 2023-09-29 18:18:28,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:31,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:37,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:18:37,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 18:18:37,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 18:18:38,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 18:18:40,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 18:18:42,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:44,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:18:45,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 18:18:45,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:18:47,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:18:47,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:47,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 18:18:49,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 18:18:52,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:18:52,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=449233.3333333333, ans=0.2 2023-09-29 18:18:53,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 18:18:54,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=449233.3333333333, ans=0.0 2023-09-29 18:18:56,388 INFO [train.py:1039] (0/4) Epoch 13, batch 3650, loss[loss=0.1791, simple_loss=0.2489, pruned_loss=0.05465, over 24319.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2643, pruned_loss=0.06052, over 4711169.93 frames. ], batch size: 56, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:18:58,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 18:18:59,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:19:03,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 18:19:05,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 18:19:09,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=449300.0, ans=0.1 2023-09-29 18:19:11,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:11,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 18:19:13,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:19:14,376 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.44 vs. limit=10.0 2023-09-29 18:19:16,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:19:16,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:19:16,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 18:19:18,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:19:18,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:20,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 18:19:21,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:19:21,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:19:22,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:24,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 18:19:28,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 18:19:28,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 18:19:29,435 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.05 vs. limit=15.0 2023-09-29 18:19:29,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:19:31,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 18:19:33,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:33,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:19:38,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.28 vs. limit=15.0 2023-09-29 18:19:39,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:19:41,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:41,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 18:19:43,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:19:43,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:19:45,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:19:48,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:51,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:53,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:19:54,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:56,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:19:59,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=449500.0, ans=0.125 2023-09-29 18:20:02,044 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 18:20:05,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:20:05,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:06,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:20:06,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:08,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:20:08,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=449566.6666666667, ans=0.2 2023-09-29 18:20:09,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:11,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 18:20:11,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:15,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:20:16,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:20:17,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 18:20:19,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=449633.3333333333, ans=0.125 2023-09-29 18:20:20,446 INFO [train.py:1039] (0/4) Epoch 13, batch 3700, loss[loss=0.1978, simple_loss=0.2685, pruned_loss=0.06352, over 23671.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2646, pruned_loss=0.06021, over 4716977.05 frames. ], batch size: 149, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:20:20,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:20,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 18:20:20,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:22,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:20:22,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:20:25,571 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.943e+02 2.154e+02 2.473e+02 4.046e+02, threshold=4.307e+02, percent-clipped=0.0 2023-09-29 18:20:25,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:20:30,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:32,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:20:33,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:20:33,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:35,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:20:38,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:39,763 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 18:20:49,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:20:49,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:20:50,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:20:50,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 18:20:51,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:20:54,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:56,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. 
Duration: 0.86 2023-09-29 18:20:56,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=449766.6666666667, ans=0.125 2023-09-29 18:20:57,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:58,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:21:01,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:02,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:21:04,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:21:08,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:21:08,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 18:21:09,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:21:10,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 18:21:16,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:21:17,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 18:21:20,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:20,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 18:21:23,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:21:23,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:21:23,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:23,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:26,189 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.98 vs. limit=22.5 2023-09-29 18:21:27,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:29,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 18:21:31,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 18:21:31,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:21:31,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:21:32,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:21:34,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:21:37,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:39,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:21:40,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:21:42,137 INFO [train.py:1039] (0/4) Epoch 13, batch 3750, loss[loss=0.1875, simple_loss=0.266, pruned_loss=0.0545, over 23897.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2657, pruned_loss=0.06058, over 4719732.94 frames. ], batch size: 86, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:21:42,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 18:21:44,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:21:45,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:21:47,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 18:21:47,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 18:21:49,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:50,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:53,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:21:55,068 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:21:57,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:01,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:22:01,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:22:04,225 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.25 vs. limit=15.0 2023-09-29 18:22:04,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:22:08,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:08,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. 
Duration: 0.96 2023-09-29 18:22:09,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=450033.3333333333, ans=0.125 2023-09-29 18:22:10,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:12,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:12,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:13,978 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=450100.0, ans=0.125 2023-09-29 18:22:15,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 18:22:19,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 18:22:20,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:21,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:23,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:29,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:31,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 18:22:34,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 18:22:39,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:41,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=450166.6666666667, ans=0.125 2023-09-29 18:22:43,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:22:44,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:22:47,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:22:51,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:22:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:22:52,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=450233.3333333333, ans=0.1 2023-09-29 18:22:55,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:22:57,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 18:22:59,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:23:03,487 INFO [train.py:1039] (0/4) Epoch 13, batch 3800, loss[loss=0.207, simple_loss=0.2755, pruned_loss=0.06927, over 23378.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2652, pruned_loss=0.06057, over 4729318.69 frames. ], batch size: 119, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:23:06,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:23:08,363 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.938e+02 2.125e+02 2.387e+02 3.006e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 18:23:12,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:12,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:23:13,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 18:23:13,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:17,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:19,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:23:22,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-29 18:23:22,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:23,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:23:25,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:25,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:23:25,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:26,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 18:23:29,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 18:23:31,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:23:36,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:39,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:23:39,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:23:41,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 18:23:41,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:42,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:43,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:48,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:23:48,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 18:23:51,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:23:53,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=450500.0, ans=0.05 2023-09-29 18:23:57,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:23:59,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=450500.0, ans=0.0 2023-09-29 18:24:01,486 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.39 vs. limit=15.0 2023-09-29 18:24:03,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 18:24:05,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 18:24:06,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 18:24:07,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:07,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=450566.6666666667, ans=0.125 2023-09-29 18:24:08,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:24:10,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:10,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 18:24:10,932 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.640e-03 2023-09-29 18:24:13,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 18:24:13,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 18:24:15,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:15,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 18:24:15,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=450566.6666666667, ans=0.2 2023-09-29 18:24:23,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:24:24,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:24:25,914 INFO [train.py:1039] (0/4) Epoch 13, batch 3850, loss[loss=0.1919, simple_loss=0.2611, pruned_loss=0.06133, over 24653.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2647, pruned_loss=0.06046, over 4712839.38 frames. ], batch size: 65, lr: 7.91e-03, grad_scale: 16.0 2023-09-29 18:24:27,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:24:29,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 18:24:29,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:24:30,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:35,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:24:37,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:40,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 18:24:42,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 18:24:47,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=450700.0, ans=0.0 2023-09-29 18:24:48,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:50,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:52,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:24:53,152 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.05 vs. limit=12.0 2023-09-29 18:24:53,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:24:55,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:57,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:57,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:57,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:24:58,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:25:01,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:03,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:03,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:25:03,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 18:25:03,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 18:25:04,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:05,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:09,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:09,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:10,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 18:25:11,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 18:25:13,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:25:16,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 18:25:16,637 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=450833.3333333333, ans=0.2 2023-09-29 18:25:19,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:25:22,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:24,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:29,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:29,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 18:25:32,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 18:25:35,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:35,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:38,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:25:38,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 18:25:40,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:25:40,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 18:25:42,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:43,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 18:25:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:43,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:47,078 INFO [train.py:1039] (0/4) Epoch 13, batch 3900, loss[loss=0.2055, simple_loss=0.2783, pruned_loss=0.06634, over 23681.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2636, pruned_loss=0.05997, over 4701020.04 frames. ], batch size: 85, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:25:47,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:25:47,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:25:48,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:25:50,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:50,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:51,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:25:51,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 18:25:53,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:54,605 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.889e+02 2.168e+02 2.543e+02 3.582e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 18:25:55,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=450966.6666666667, ans=0.2 2023-09-29 18:25:56,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:25:57,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:25:57,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 18:25:57,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:26:00,179 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:26:00,853 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=15.0 2023-09-29 18:26:03,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:26:04,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:06,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:26:07,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 18:26:07,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:09,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 18:26:11,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:11,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 18:26:12,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. 
Duration: 0.78 2023-09-29 18:26:18,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:20,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:26:20,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:26:22,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:26:25,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:26,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:26:29,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:26:29,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:26:31,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:26:36,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:36,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:26:42,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 18:26:44,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:26:51,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=451233.3333333333, ans=0.125 2023-09-29 18:26:55,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=451233.3333333333, ans=0.0 2023-09-29 18:26:57,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:26:59,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451233.3333333333, ans=0.1 2023-09-29 18:27:00,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:00,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 18:27:00,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=451233.3333333333, ans=0.0 2023-09-29 18:27:01,143 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.32 vs. limit=15.0 2023-09-29 18:27:01,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 18:27:01,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 18:27:02,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 18:27:03,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:27:04,166 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.19 vs. limit=10.0 2023-09-29 18:27:06,298 INFO [train.py:1039] (0/4) Epoch 13, batch 3950, loss[loss=0.2028, simple_loss=0.2832, pruned_loss=0.06124, over 23986.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2631, pruned_loss=0.06037, over 4692415.80 frames. ], batch size: 80, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:27:06,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 18:27:12,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:27:14,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 18:27:14,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=451300.0, ans=0.5 2023-09-29 18:27:15,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:27:17,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 18:27:17,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=451300.0, ans=0.125 2023-09-29 18:27:18,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:27:23,661 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 18:27:25,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:25,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 18:27:25,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 18:27:25,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:27:28,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:29,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:27:29,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:30,931 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=451366.6666666667, ans=0.2 2023-09-29 18:27:32,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. 
Duration: 0.95 2023-09-29 18:27:35,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:27:35,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:35,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:27:36,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:27:36,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:27:46,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=451433.3333333333, ans=0.125 2023-09-29 18:27:50,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:27:50,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:27:57,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 18:28:05,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 18:28:05,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 18:28:05,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 18:28:05,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:28:13,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:28:13,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:28:13,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:28:13,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:28:14,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 18:28:20,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:28:22,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:28:26,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 18:28:30,230 INFO [train.py:1039] (0/4) Epoch 13, batch 4000, loss[loss=0.2618, simple_loss=0.3094, pruned_loss=0.1071, over 19337.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2638, pruned_loss=0.06066, over 4689504.52 frames. 
], batch size: 388, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:28:30,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=451633.3333333333, ans=0.125 2023-09-29 18:28:37,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:37,586 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=451633.3333333333, ans=0.125 2023-09-29 18:28:38,700 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.910e+02 2.120e+02 2.727e+02 3.930e+02, threshold=4.239e+02, percent-clipped=0.0 2023-09-29 18:28:43,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:48,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:49,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:28:49,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:51,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 18:28:51,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:28:53,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. 
Duration: 0.88 2023-09-29 18:28:53,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:28:53,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 18:28:55,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:56,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=451700.0, ans=0.2 2023-09-29 18:28:58,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:28:58,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:28:58,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:29:00,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:00,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:29:01,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:29:03,316 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. 
Duration: 0.896 2023-09-29 18:29:03,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=451766.6666666667, ans=0.0 2023-09-29 18:29:04,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:29:04,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:07,964 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 18:29:09,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:29:09,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:16,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 18:29:16,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=451766.6666666667, ans=0.0 2023-09-29 18:29:17,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:19,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:29:20,932 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 18:29:23,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 18:29:23,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 18:29:25,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:29:27,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:27,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:29:28,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:29:30,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:29:30,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:32,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 18:29:33,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:35,568 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 18:29:40,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:29:43,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-09-29 18:29:46,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:29:46,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:48,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:29:48,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:29:51,471 INFO [train.py:1039] (0/4) Epoch 13, batch 4050, loss[loss=0.1892, simple_loss=0.2584, pruned_loss=0.05999, over 23257.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.265, pruned_loss=0.06091, over 4702490.70 frames. ], batch size: 119, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:29:54,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:56,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:29:56,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 18:29:59,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:29:59,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:01,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 18:30:03,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:03,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451966.6666666667, ans=0.1 2023-09-29 18:30:04,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:08,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:13,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:30:13,447 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=452033.3333333333, ans=0.2 2023-09-29 18:30:14,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:30:16,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:30:18,037 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=452033.3333333333, ans=0.0 2023-09-29 18:30:19,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:30:20,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:22,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 18:30:26,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 18:30:26,223 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 18:30:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:30:30,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=452100.0, ans=0.5 2023-09-29 18:30:33,004 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=452100.0, ans=0.125 2023-09-29 18:30:35,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 18:30:37,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:30:40,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:44,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:30:44,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:46,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:46,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=452166.6666666667, ans=0.0 2023-09-29 18:30:49,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=452166.6666666667, ans=0.0 2023-09-29 18:30:50,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 18:30:50,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:30:51,130 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=452166.6666666667, ans=0.1 2023-09-29 18:30:52,463 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:30:54,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 18:30:58,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:31:07,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 18:31:07,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:31:07,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:31:10,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 18:31:12,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 18:31:12,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:14,157 INFO [train.py:1039] (0/4) Epoch 13, batch 4100, loss[loss=0.1867, simple_loss=0.257, pruned_loss=0.05826, over 23713.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2659, pruned_loss=0.06096, over 4713369.50 frames. ], batch size: 149, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:31:14,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:14,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:16,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:31:22,213 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.980e+02 2.231e+02 2.743e+02 3.910e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 18:31:22,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 18:31:25,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. 
Duration: 0.9 2023-09-29 18:31:27,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 18:31:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 18:31:28,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:28,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=452366.6666666667, ans=0.0 2023-09-29 18:31:29,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:31:30,185 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 18:31:33,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:34,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:31:34,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:38,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 18:31:39,032 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-09-29 18:31:41,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:31:43,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:43,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:31:43,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 18:31:43,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:31:43,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:45,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:31:45,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 18:31:48,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 18:31:48,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=452433.3333333333, ans=0.2 2023-09-29 18:31:50,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 18:31:52,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:55,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:55,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 18:31:56,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:57,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:31:58,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:32:00,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 18:32:00,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:32:01,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:32:04,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. 
Duration: 0.77 2023-09-29 18:32:04,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:06,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:09,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:15,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:15,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=452500.0, ans=0.125 2023-09-29 18:32:16,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:19,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:32:27,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:27,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:31,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:32,236 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:32:34,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 18:32:36,418 INFO [train.py:1039] (0/4) Epoch 13, batch 4150, loss[loss=0.2073, simple_loss=0.2846, pruned_loss=0.06495, over 23907.00 frames. ], tot_loss[loss=0.1945, simple_loss=0.2666, pruned_loss=0.06122, over 4726139.31 frames. ], batch size: 86, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:32:38,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:38,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:32:39,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:32:39,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:32:41,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 18:32:42,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:44,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 18:32:46,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 18:32:46,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 18:32:48,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:32:53,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:32:53,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:57,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:59,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:00,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:33:02,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:33:02,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:33:03,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:33:08,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:13,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:14,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 18:33:16,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. 
Duration: 0.77 2023-09-29 18:33:16,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:33:18,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 18:33:18,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:33:18,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:21,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:21,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:25,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 18:33:30,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:31,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:33:31,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 18:33:33,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:33,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-09-29 18:33:37,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:33:39,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:40,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:42,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 18:33:42,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:42,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:33:43,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:33:45,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 18:33:45,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:45,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:33:47,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:33:48,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. 
Duration: 0.77 2023-09-29 18:33:48,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:48,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:33:50,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:51,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:51,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 18:33:51,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:52,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=452900.0, ans=0.1 2023-09-29 18:33:59,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:34:00,817 INFO [train.py:1039] (0/4) Epoch 13, batch 4200, loss[loss=0.1718, simple_loss=0.2468, pruned_loss=0.04842, over 24541.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2656, pruned_loss=0.06127, over 4726156.75 frames. ], batch size: 60, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:34:01,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 18:34:04,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 18:34:06,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:07,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:34:09,027 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.933e+02 2.202e+02 2.518e+02 4.955e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-29 18:34:09,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:09,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:10,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 18:34:15,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 18:34:15,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:17,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:20,945 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.58 vs. limit=15.0 2023-09-29 18:34:21,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 18:34:24,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:34:26,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:34:26,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:28,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 18:34:28,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:28,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:29,299 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.50 vs. limit=12.0 2023-09-29 18:34:30,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:30,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:34:31,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:34:35,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 18:34:35,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:34:39,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:34:41,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:34:44,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:34:44,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:34:47,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:34:47,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 18:34:47,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:34:49,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:34:55,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:34:56,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 18:35:00,833 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=453166.6666666667, ans=0.125 2023-09-29 18:35:02,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:35:02,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=453166.6666666667, ans=0.05 2023-09-29 18:35:05,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 18:35:07,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:35:12,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:35:14,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:17,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 18:35:22,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:35:22,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=453300.0, ans=0.125 2023-09-29 18:35:23,587 INFO [train.py:1039] (0/4) Epoch 13, batch 4250, loss[loss=0.1839, simple_loss=0.2509, pruned_loss=0.05839, over 23858.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2643, pruned_loss=0.06069, over 4724990.98 frames. 
], batch size: 195, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:35:25,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:35:25,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:35:27,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:30,473 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-68000.pt 2023-09-29 18:35:36,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:35:37,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 18:35:37,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:35:40,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:43,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:35:49,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:49,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:51,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 18:35:51,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:35:52,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:52,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:54,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:57,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:35:58,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:35:59,380 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.31 vs. limit=15.0 2023-09-29 18:36:00,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 18:36:03,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 18:36:03,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:04,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:36:04,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:36:06,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:36:06,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:06,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:06,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=453433.3333333333, ans=0.125 2023-09-29 18:36:11,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:36:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:36:18,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:18,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:20,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 18:36:20,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:36:22,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. 
Duration: 0.98 2023-09-29 18:36:24,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:36:25,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:36:28,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:28,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:36:28,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 18:36:30,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:36:31,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:36:32,447 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=453566.6666666667, ans=0.0 2023-09-29 18:36:34,284 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.76 vs. limit=15.0 2023-09-29 18:36:36,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:38,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:36:39,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:36:42,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:42,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:43,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:36:45,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:36:45,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 18:36:46,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:50,404 INFO [train.py:1039] (0/4) Epoch 13, batch 4300, loss[loss=0.1948, simple_loss=0.2618, pruned_loss=0.06393, over 22708.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2637, pruned_loss=0.06069, over 4723309.38 frames. ], batch size: 322, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:36:52,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:52,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:36:57,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:36:58,705 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 2.045e+02 2.305e+02 2.959e+02 4.581e+02, threshold=4.610e+02, percent-clipped=2.0 2023-09-29 18:37:04,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:37:04,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 18:37:06,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:37:08,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:37:08,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:37:08,206 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 18:37:11,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:37:11,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=453700.0, ans=0.0 2023-09-29 18:37:14,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:19,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 18:37:19,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 18:37:19,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=453700.0, ans=0.1 2023-09-29 18:37:20,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 18:37:21,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=453766.6666666667, ans=0.2 2023-09-29 18:37:22,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:37:23,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:37:24,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=453766.6666666667, ans=0.125 2023-09-29 18:37:29,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:37:29,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:37:31,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:37:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:32,040 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:37:33,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:37:33,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 18:37:34,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 18:37:37,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:37:38,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=453833.3333333333, ans=0.125 2023-09-29 18:37:41,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:37:41,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:41,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 18:37:41,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 18:37:42,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 18:37:44,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:37:44,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 18:37:44,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 18:37:47,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:48,983 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 18:37:50,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:37:52,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:37:52,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:54,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 18:37:55,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:55,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:55,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:37:55,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 18:37:57,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:37:58,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:38:02,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:04,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:04,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:38:09,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 18:38:10,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:38:10,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=453966.6666666667, ans=0.125 2023-09-29 18:38:11,800 INFO [train.py:1039] (0/4) Epoch 13, batch 4350, loss[loss=0.1763, simple_loss=0.2557, pruned_loss=0.0485, over 24500.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2639, pruned_loss=0.06, over 4740510.49 frames. ], batch size: 60, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:38:13,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=453966.6666666667, ans=0.0 2023-09-29 18:38:15,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:38:18,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:21,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:38:21,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:38:21,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=453966.6666666667, ans=0.125 2023-09-29 18:38:25,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:38:31,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:32,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:38:32,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:38:36,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:38:39,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:38:40,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 18:38:40,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=454033.3333333333, ans=0.0 2023-09-29 18:38:46,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 18:38:48,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:48,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:50,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=454100.0, ans=0.0 2023-09-29 18:38:54,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:57,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 18:39:00,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:02,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:39:08,755 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 18:39:08,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:10,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 18:39:10,481 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 18:39:12,567 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 18:39:12,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:14,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:16,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:39:16,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:19,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:19,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:39:19,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=454233.3333333333, ans=0.125 2023-09-29 18:39:22,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 18:39:22,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:22,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 18:39:22,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:22,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=454233.3333333333, ans=0.125 2023-09-29 18:39:23,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 18:39:25,287 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 18:39:25,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 18:39:25,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 18:39:28,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:39:29,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:39:29,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:31,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:39:32,939 INFO [train.py:1039] (0/4) Epoch 13, batch 4400, loss[loss=0.1879, simple_loss=0.2766, pruned_loss=0.04963, over 24461.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2643, pruned_loss=0.05991, over 4745484.20 frames. 
], batch size: 69, lr: 7.88e-03, grad_scale: 32.0 2023-09-29 18:39:33,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 18:39:34,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=454300.0, ans=0.95 2023-09-29 18:39:36,011 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 18:39:36,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:39,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:39,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:41,136 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.889e+02 2.108e+02 2.568e+02 3.749e+02, threshold=4.217e+02, percent-clipped=0.0 2023-09-29 18:39:42,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:44,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 18:39:44,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 18:39:44,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 18:39:46,055 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. 
Duration: 0.896 2023-09-29 18:39:46,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:39:46,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:48,843 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.13 vs. limit=15.0 2023-09-29 18:39:50,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 18:39:51,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:51,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:51,779 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 18:39:55,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:55,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 18:39:55,381 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 18:39:58,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 18:39:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. 
Duration: 0.99 2023-09-29 18:39:59,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 18:39:59,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:00,669 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.84 vs. limit=15.0 2023-09-29 18:40:01,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:03,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=454366.6666666667, ans=0.0 2023-09-29 18:40:04,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 18:40:04,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 18:40:04,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:07,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:40:07,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 18:40:09,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:09,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:09,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 18:40:10,778 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 18:40:11,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=454433.3333333333, ans=0.125 2023-09-29 18:40:14,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:24,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:28,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 18:40:31,955 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=454500.0, ans=0.0 2023-09-29 18:40:33,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:40:34,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 18:40:35,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=454500.0, ans=0.1 2023-09-29 18:40:37,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:40:37,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 18:40:37,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:40:37,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:40:37,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:40:39,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:40:41,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 18:40:44,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 18:40:45,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 18:40:45,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:45,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. 
Duration: 0.79 2023-09-29 18:40:47,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:40:51,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:40:54,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 18:40:56,115 INFO [train.py:1039] (0/4) Epoch 13, batch 4450, loss[loss=0.2049, simple_loss=0.2737, pruned_loss=0.06804, over 23552.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2656, pruned_loss=0.0606, over 4737694.45 frames. ], batch size: 256, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:40:56,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:58,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:58,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:41:06,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:06,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:41:06,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=454633.3333333333, ans=0.1 2023-09-29 18:41:12,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:41:13,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:41:16,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:41:17,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:18,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 18:41:18,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:18,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:18,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:41:18,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:41:21,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:41:27,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:28,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:29,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 18:41:31,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:31,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:41:33,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=454766.6666666667, ans=0.125 2023-09-29 18:41:35,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:41:37,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 18:41:37,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 18:41:37,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:41:41,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:42,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 18:41:45,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:41:50,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:50,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. 
Duration: 0.8 2023-09-29 18:41:50,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:50,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:41:50,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:41:50,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:53,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:54,121 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.82 vs. limit=22.5 2023-09-29 18:41:55,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:41:57,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 18:41:59,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:42:01,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=454900.0, ans=0.0 2023-09-29 18:42:02,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:04,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:42:05,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:07,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:42:07,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:42:10,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 18:42:14,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:42:17,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:18,812 INFO [train.py:1039] (0/4) Epoch 13, batch 4500, loss[loss=0.2037, simple_loss=0.2846, pruned_loss=0.06143, over 24392.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2663, pruned_loss=0.06023, over 4737058.29 frames. ], batch size: 69, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:42:18,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 18:42:18,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 18:42:20,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:25,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:42:25,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:26,531 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 2.036e+02 2.219e+02 2.497e+02 4.181e+02, threshold=4.438e+02, percent-clipped=0.0 2023-09-29 18:42:28,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:42:28,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:42:30,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:30,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:42,222 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=455033.3333333333, ans=0.5 2023-09-29 18:42:43,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:43,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:42:48,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:49,408 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.19 vs. 
limit=22.5 2023-09-29 18:42:49,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:42:50,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=455100.0, ans=0.0 2023-09-29 18:42:51,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:42:57,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:43:02,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:43:07,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:43:11,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:43:12,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 18:43:13,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:13,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 18:43:17,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:43:17,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 18:43:17,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:43:17,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:22,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:43:22,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:43:26,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:29,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:43:29,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:43:31,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 18:43:32,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 18:43:32,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. 
Duration: 0.8 2023-09-29 18:43:38,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 18:43:40,825 INFO [train.py:1039] (0/4) Epoch 13, batch 4550, loss[loss=0.1898, simple_loss=0.2591, pruned_loss=0.06028, over 23646.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2651, pruned_loss=0.06004, over 4722450.56 frames. ], batch size: 149, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:43:41,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 18:43:44,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:43:46,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:47,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:48,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=455300.0, ans=0.0 2023-09-29 18:43:49,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:43:54,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:43:55,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:58,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 18:43:58,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:43:58,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:00,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:02,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:44:04,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:07,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 18:44:08,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 18:44:10,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:44:10,964 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.98 vs. limit=15.0 2023-09-29 18:44:11,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 18:44:16,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 18:44:16,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 18:44:20,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 18:44:23,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:44:25,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:44:28,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 18:44:31,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:34,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:34,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:37,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:37,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 18:44:39,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-09-29 18:44:39,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:44:39,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 18:44:43,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 18:44:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:45,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:45,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:47,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:47,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:44:47,246 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=455566.6666666667, ans=0.125 2023-09-29 18:44:48,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 18:44:48,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=455566.6666666667, ans=0.09899494936611666 2023-09-29 18:44:48,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=455566.6666666667, ans=0.2 2023-09-29 18:44:50,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 18:44:52,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:52,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=455566.6666666667, ans=0.0 2023-09-29 18:44:54,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:44:54,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 18:44:54,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:44:54,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 18:44:57,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:44:57,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 18:44:59,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=455566.6666666667, ans=0.0 2023-09-29 18:45:00,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:45:00,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:45:01,775 INFO [train.py:1039] (0/4) Epoch 13, batch 4600, loss[loss=0.1871, simple_loss=0.249, pruned_loss=0.06259, over 22748.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2639, pruned_loss=0.06034, over 4703681.01 frames. ], batch size: 322, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:45:01,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:45:03,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:45:03,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=455633.3333333333, ans=0.05 2023-09-29 18:45:05,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:45:07,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:07,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:45:07,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=455633.3333333333, ans=0.125 2023-09-29 18:45:10,371 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.841e+02 2.065e+02 2.321e+02 3.867e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-29 18:45:10,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:45:10,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:45:11,294 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=15.0 2023-09-29 18:45:12,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:13,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 18:45:15,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:45:19,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:45:21,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:22,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:45:30,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 18:45:31,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:34,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:35,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=455766.6666666667, ans=0.0 2023-09-29 18:45:38,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:45:38,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:38,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=455766.6666666667, ans=0.0 2023-09-29 18:45:39,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=455766.6666666667, ans=0.07 2023-09-29 18:45:44,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 18:45:44,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:45:44,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 18:45:48,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=455766.6666666667, ans=0.2 2023-09-29 18:45:49,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:49,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:45:51,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:45:54,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 18:45:57,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:46:01,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:03,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:05,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:05,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 18:46:06,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:06,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. 
Duration: 0.89 2023-09-29 18:46:06,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:08,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:09,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:09,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:46:11,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:11,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 18:46:13,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 18:46:13,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 18:46:13,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:16,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 18:46:24,433 INFO [train.py:1039] (0/4) Epoch 13, batch 4650, loss[loss=0.1937, simple_loss=0.2688, pruned_loss=0.05928, over 24659.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2636, pruned_loss=0.05974, over 4709885.05 frames. ], batch size: 65, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:46:24,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:46:28,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:28,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:28,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:46:28,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:30,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:30,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:35,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 18:46:35,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=455966.6666666667, ans=0.2 2023-09-29 18:46:36,193 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=10.01 vs. 
limit=10.0 2023-09-29 18:46:39,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:46:43,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 18:46:43,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:43,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 18:46:43,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:46:44,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 18:46:44,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 18:46:46,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:46,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:46:49,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:46:50,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:52,428 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-09-29 18:46:54,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:55,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 18:46:58,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:58,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:47:00,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 18:47:00,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:04,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:47:09,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:13,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:15,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:19,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 18:47:20,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 18:47:21,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 18:47:22,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 18:47:22,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 18:47:22,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=456166.6666666667, ans=0.2 2023-09-29 18:47:23,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:29,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:47:29,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:31,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 18:47:31,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:31,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:31,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 18:47:33,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:47:36,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:47:37,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:39,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:41,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:42,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:47:42,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:47:42,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 18:47:44,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:47:45,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 18:47:46,901 INFO [train.py:1039] (0/4) Epoch 13, batch 4700, loss[loss=0.2056, simple_loss=0.2708, pruned_loss=0.07019, over 23733.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.264, pruned_loss=0.06019, over 4708168.80 frames. 
], batch size: 212, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:47:49,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=456300.0, ans=0.125 2023-09-29 18:47:52,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:53,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:54,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:55,240 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.032e+02 2.349e+02 2.827e+02 4.344e+02, threshold=4.699e+02, percent-clipped=1.0 2023-09-29 18:47:55,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:56,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:48:02,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 18:48:02,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 18:48:06,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:08,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 18:48:08,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:48:11,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:12,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=456366.6666666667, ans=0.125 2023-09-29 18:48:19,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:48:20,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:48:21,996 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:48:23,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:48:31,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 18:48:33,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:48:36,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:39,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 18:48:41,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 18:48:46,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:48:46,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 18:48:48,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:48,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:52,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:52,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:48:54,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 18:48:55,758 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 18:48:55,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:59,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. 
Duration: 0.99 2023-09-29 18:49:01,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:49:04,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 18:49:07,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:49:07,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:10,433 INFO [train.py:1039] (0/4) Epoch 13, batch 4750, loss[loss=0.1971, simple_loss=0.275, pruned_loss=0.05962, over 23939.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2649, pruned_loss=0.06087, over 4675370.67 frames. ], batch size: 86, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:49:12,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:13,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:49:15,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 18:49:15,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:18,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 18:49:20,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 18:49:20,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:49:22,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:29,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 18:49:29,689 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=456700.0, ans=0.125 2023-09-29 18:49:33,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:49:34,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 18:49:36,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:39,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:40,935 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 18:49:40,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-09-29 18:49:42,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=456766.6666666667, ans=0.125 2023-09-29 18:49:48,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 18:49:51,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:53,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:49:56,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:49:56,682 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 18:49:56,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:49:59,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:50:00,279 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=456833.3333333333, ans=0.125 2023-09-29 18:50:02,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:50:04,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 18:50:05,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. 
Duration: 0.87 2023-09-29 18:50:06,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:06,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:50:07,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:07,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:50:07,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 18:50:11,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 18:50:14,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:16,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=456900.0, ans=0.1 2023-09-29 18:50:17,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:50:17,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 18:50:17,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:50:17,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=456900.0, ans=0.125 2023-09-29 18:50:18,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:21,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:50:22,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:23,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:50:25,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:25,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 18:50:27,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 18:50:28,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 18:50:31,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:50:31,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:33,041 INFO [train.py:1039] (0/4) Epoch 13, batch 4800, loss[loss=0.2039, simple_loss=0.2722, pruned_loss=0.06779, over 23616.00 frames. 
], tot_loss[loss=0.195, simple_loss=0.2662, pruned_loss=0.06188, over 4682567.62 frames. ], batch size: 149, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:50:33,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 18:50:40,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:40,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:40,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=456966.6666666667, ans=0.0 2023-09-29 18:50:41,339 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.70 vs. limit=15.0 2023-09-29 18:50:43,543 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.965e+02 2.180e+02 2.463e+02 4.053e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 18:50:47,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:50:48,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:49,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:50,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 18:50:50,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:50:50,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=457033.3333333333, ans=0.125 2023-09-29 18:50:51,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:50:53,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:50:58,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:59,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:50:59,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:50:59,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:02,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:03,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 18:51:04,187 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.20 vs. limit=15.0 2023-09-29 18:51:06,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:07,064 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:51:08,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:08,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:51:09,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:51:12,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:12,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 18:51:14,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 18:51:14,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:15,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:51:15,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 18:51:15,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:15,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:51:17,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=457100.0, ans=0.125 2023-09-29 18:51:17,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=457100.0, ans=0.1 2023-09-29 18:51:20,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:51:20,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:25,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:28,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:31,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:36,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 18:51:36,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:36,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:51:38,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:51:39,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:42,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:44,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:51:44,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:44,815 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=457233.3333333333, ans=0.5 2023-09-29 18:51:46,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:51:46,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:51:47,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:51:50,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:51,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:51:51,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:53,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 18:51:54,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 18:51:55,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:55,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:56,761 INFO [train.py:1039] (0/4) Epoch 13, batch 4850, loss[loss=0.1962, simple_loss=0.2568, pruned_loss=0.06785, over 23736.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2655, pruned_loss=0.06182, over 4681638.05 frames. ], batch size: 232, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:51:56,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:51:56,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:58,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=457300.0, ans=0.2 2023-09-29 18:52:01,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:52:06,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=457300.0, ans=0.0 2023-09-29 18:52:07,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 18:52:08,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:08,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=457300.0, ans=0.125 2023-09-29 18:52:13,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:14,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:52:14,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:20,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:21,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:52:23,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:52:23,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. 
Duration: 0.98 2023-09-29 18:52:24,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=457366.6666666667, ans=0.125 2023-09-29 18:52:27,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:52:29,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:52:29,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:52:30,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:52:30,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 18:52:33,956 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=457433.3333333333, ans=0.1 2023-09-29 18:52:35,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:35,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 18:52:39,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. 
Duration: 0.67 2023-09-29 18:52:39,748 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-09-29 18:52:40,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:52:49,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:52:49,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 18:52:50,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:52:50,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:52:52,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:52:54,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 18:52:54,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:54,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=457500.0, ans=0.125 2023-09-29 18:52:56,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. 
Duration: 0.82 2023-09-29 18:52:56,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:58,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:52:58,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 18:53:07,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:14,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:53:14,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:18,648 INFO [train.py:1039] (0/4) Epoch 13, batch 4900, loss[loss=0.1914, simple_loss=0.2714, pruned_loss=0.05575, over 23836.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2651, pruned_loss=0.06104, over 4693451.64 frames. ], batch size: 86, lr: 7.85e-03, grad_scale: 16.0 2023-09-29 18:53:22,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 18:53:22,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:53:27,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:29,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 18:53:29,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:53:31,868 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-09-29 18:53:32,646 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 2.076e+02 2.390e+02 2.815e+02 4.365e+02, threshold=4.780e+02, percent-clipped=1.0 2023-09-29 18:53:32,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 18:53:38,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 18:53:43,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 18:53:43,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 18:53:45,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:45,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:45,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:53:46,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:46,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 18:53:46,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 18:53:50,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 18:53:51,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:53:52,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:53:53,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:55,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:53:56,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:58,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:58,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 18:54:00,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:54:00,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:54:00,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. 
Duration: 0.81 2023-09-29 18:54:00,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 18:54:01,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=457766.6666666667, ans=0.0 2023-09-29 18:54:05,493 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.08 vs. limit=10.0 2023-09-29 18:54:06,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 18:54:06,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:54:08,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:08,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:54:09,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:09,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:54:09,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:54:11,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 18:54:14,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:54:15,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:54:17,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:54:20,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 18:54:21,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:54:23,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:54:23,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 18:54:31,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:33,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:54:35,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 18:54:35,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:35,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:54:37,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:54:40,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:54:40,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:54:40,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:41,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:54:41,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:54:43,214 INFO [train.py:1039] (0/4) Epoch 13, batch 4950, loss[loss=0.1899, simple_loss=0.2746, pruned_loss=0.05256, over 24667.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.265, pruned_loss=0.0602, over 4715353.42 frames. ], batch size: 73, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:54:46,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:54:46,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:46,972 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.43 vs. limit=15.0 2023-09-29 18:54:49,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-29 18:54:49,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=457966.6666666667, ans=0.125 2023-09-29 18:54:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 18:54:50,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:54:52,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 18:54:52,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:52,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:52,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:54:52,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:54:54,981 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.72 vs. limit=22.5 2023-09-29 18:54:55,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:57,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 18:54:57,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:54:58,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:55:02,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:04,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:55:08,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:55:13,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:13,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:55:15,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:16,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:19,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:55:19,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 18:55:21,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. 
Duration: 0.85 2023-09-29 18:55:23,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:26,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:55:26,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:55:27,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:55:27,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:55:29,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:55:30,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:33,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:55:34,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:55:36,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:36,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:38,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-09-29 18:55:38,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:55:40,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:55:40,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=458166.6666666667, ans=0.1 2023-09-29 18:55:40,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=458166.6666666667, ans=0.125 2023-09-29 18:55:44,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:55:45,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:55:45,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:55:45,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:47,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:55:47,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:55:48,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:55:50,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:55:50,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=458233.3333333333, ans=0.125 2023-09-29 18:55:51,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:51,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 18:55:56,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:55:59,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=458233.3333333333, ans=0.0 2023-09-29 18:56:01,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 18:56:02,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:56:05,430 INFO [train.py:1039] (0/4) Epoch 13, batch 5000, loss[loss=0.1664, simple_loss=0.2459, pruned_loss=0.04345, over 24657.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2636, pruned_loss=0.05969, over 4719610.04 frames. ], batch size: 65, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:56:07,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:07,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 18:56:09,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 18:56:10,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 18:56:13,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:56:13,300 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:56:14,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 18:56:14,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:56:14,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:56:16,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 18:56:16,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:17,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:19,081 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.946e+02 2.294e+02 2.903e+02 4.132e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 18:56:19,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. 
Duration: 0.61 2023-09-29 18:56:19,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:19,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:20,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 18:56:20,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 18:56:22,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:56:23,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 18:56:23,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:56:23,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:25,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:56:25,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 18:56:25,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 18:56:25,952 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.87 vs. 
limit=15.0 2023-09-29 18:56:27,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 18:56:28,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:28,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:30,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 18:56:30,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:30,867 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.14 vs. limit=15.0 2023-09-29 18:56:30,886 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.09 vs. limit=15.0 2023-09-29 18:56:31,140 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.38 vs. limit=15.0 2023-09-29 18:56:31,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:31,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:33,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-09-29 18:56:36,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 18:56:36,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:56:39,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:56:43,418 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 18:56:47,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:49,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:49,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:56:52,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 18:56:52,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:53,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:53,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:55,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. 
Duration: 0.336375 2023-09-29 18:56:56,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:56:59,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:57:01,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:03,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=458500.0, ans=0.1 2023-09-29 18:57:07,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 18:57:10,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=458566.6666666667, ans=0.125 2023-09-29 18:57:11,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:57:23,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:57:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 18:57:25,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:57:25,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:57:25,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:27,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=458633.3333333333, ans=0.125 2023-09-29 18:57:28,251 INFO [train.py:1039] (0/4) Epoch 13, batch 5050, loss[loss=0.201, simple_loss=0.272, pruned_loss=0.06495, over 23999.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2652, pruned_loss=0.06071, over 4710826.28 frames. ], batch size: 86, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:57:29,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:29,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 18:57:31,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:57:32,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:34,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:57:35,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. 
Duration: 0.94 2023-09-29 18:57:36,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:38,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:57:39,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:57:41,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:57:42,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:57:54,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 18:57:54,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:57:54,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:57:54,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 18:57:55,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=458700.0, ans=0.1 2023-09-29 18:57:57,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:57:58,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 18:57:58,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:58,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:57:58,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 18:58:00,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 18:58:01,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:03,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:06,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=458766.6666666667, ans=0.125 2023-09-29 18:58:07,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:07,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 18:58:10,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:12,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 18:58:12,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 18:58:13,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:58:14,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:15,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:58:17,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:58:20,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:58:20,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:20,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:58:21,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:58:21,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 18:58:23,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:58:24,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=458833.3333333333, ans=0.125 2023-09-29 18:58:26,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 18:58:32,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:33,436 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 18:58:33,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:58:33,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:58:33,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:35,784 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 18:58:40,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:40,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 18:58:40,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:40,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=458900.0, ans=0.2 2023-09-29 18:58:43,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:43,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 18:58:44,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 18:58:46,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 18:58:48,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:58:48,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:58:48,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=458966.6666666667, ans=0.125 2023-09-29 18:58:49,648 INFO [train.py:1039] (0/4) Epoch 13, batch 5100, loss[loss=0.2041, simple_loss=0.2701, pruned_loss=0.06908, over 23755.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.265, pruned_loss=0.06008, over 4721403.93 frames. ], batch size: 212, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:58:49,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:58:52,907 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 18:58:54,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:57,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 18:58:57,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. 
Duration: 0.73 2023-09-29 18:58:59,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:00,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:59:02,841 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.978e+02 2.231e+02 2.583e+02 5.581e+02, threshold=4.463e+02, percent-clipped=1.0 2023-09-29 18:59:03,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:59:04,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 18:59:04,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 18:59:09,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:59:11,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:59:12,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=459033.3333333333, ans=0.0 2023-09-29 18:59:15,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:16,902 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=459033.3333333333, ans=0.1 2023-09-29 18:59:19,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-09-29 18:59:19,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:20,480 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.11 vs. limit=6.0 2023-09-29 18:59:22,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:59:22,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:59:25,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:25,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:26,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 18:59:27,282 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=459100.0, ans=0.125 2023-09-29 18:59:28,585 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 18:59:28,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:30,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 18:59:30,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. 
Duration: 0.97 2023-09-29 18:59:34,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:39,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=459166.6666666667, ans=0.125 2023-09-29 18:59:43,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:59:47,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 18:59:47,431 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 18:59:47,444 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 18:59:48,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 18:59:48,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:52,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=459166.6666666667, ans=0.125 2023-09-29 18:59:53,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 18:59:56,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 18:59:59,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 18:59:59,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:00:02,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 19:00:02,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=459233.3333333333, ans=0.125 2023-09-29 19:00:04,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:00:04,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 19:00:10,217 INFO [train.py:1039] (0/4) Epoch 13, batch 5150, loss[loss=0.195, simple_loss=0.2563, pruned_loss=0.0669, over 23953.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2659, pruned_loss=0.06043, over 4725049.93 frames. ], batch size: 196, lr: 7.83e-03, grad_scale: 8.0 2023-09-29 19:00:10,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:00:10,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:00:10,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:00:10,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:00:10,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 19:00:12,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:00:12,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=459300.0, ans=0.0 2023-09-29 19:00:14,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 19:00:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 19:00:16,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 19:00:16,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:00:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 19:00:17,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:17,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:00:20,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:23,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:27,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-29 19:00:27,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 19:00:29,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:30,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:00:32,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:00:32,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:32,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:32,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:00:32,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:00:33,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 19:00:34,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:00:34,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:00:37,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 19:00:38,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 19:00:40,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:00:47,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:00:49,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 19:00:51,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:59,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:59,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:04,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:05,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:07,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 19:01:07,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=459500.0, ans=10.0 2023-09-29 19:01:12,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 19:01:13,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:01:13,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:01:16,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:18,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:18,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 19:01:19,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=459566.6666666667, ans=0.2 2023-09-29 19:01:22,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:25,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:01:27,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=459566.6666666667, ans=0.09899494936611666 2023-09-29 19:01:28,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:01:28,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:01:30,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 19:01:30,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:01:30,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:01:32,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:01:32,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=459633.3333333333, ans=0.0 2023-09-29 19:01:33,953 INFO [train.py:1039] (0/4) Epoch 13, batch 5200, loss[loss=0.2092, simple_loss=0.2874, pruned_loss=0.06553, over 24024.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2656, pruned_loss=0.06017, over 4718169.41 frames. ], batch size: 80, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:01:34,999 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.27 vs. limit=15.0 2023-09-29 19:01:35,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:01:37,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:01:40,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:43,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. 
Duration: 0.71 2023-09-29 19:01:46,060 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.009e+02 2.232e+02 2.701e+02 3.997e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 19:01:46,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:01:46,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:48,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=459700.0, ans=0.0 2023-09-29 19:01:49,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:50,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:01:51,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:51,330 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=459700.0, ans=0.1 2023-09-29 19:01:53,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 19:01:55,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:01:57,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:59,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. 
Duration: 0.55 2023-09-29 19:02:00,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:02:02,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:02:02,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 19:02:03,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 19:02:05,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 19:02:06,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:06,239 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 19:02:06,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:02:09,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:10,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:02:10,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 19:02:10,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 19:02:12,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:16,206 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.95 vs. limit=22.5 2023-09-29 19:02:16,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 19:02:16,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 19:02:18,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 19:02:22,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 19:02:23,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:02:26,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=459833.3333333333, ans=0.0 2023-09-29 19:02:30,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:02:30,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:33,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 19:02:33,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:02:34,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:02:34,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:34,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:02:39,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:40,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:02:44,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:44,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:02:44,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:48,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=459900.0, ans=6.0 2023-09-29 19:02:49,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:50,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-09-29 19:02:50,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:52,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:02:52,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:54,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:02:55,509 INFO [train.py:1039] (0/4) Epoch 13, batch 5250, loss[loss=0.1874, simple_loss=0.2357, pruned_loss=0.06955, over 19359.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2652, pruned_loss=0.06021, over 4717431.32 frames. ], batch size: 390, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:02:55,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:02:57,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:03:01,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:01,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:03:03,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:03:09,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:03:10,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=459966.6666666667, ans=0.125 2023-09-29 19:03:11,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:03:11,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=460033.3333333333, ans=0.125 2023-09-29 19:03:14,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:03:14,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:03:18,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 19:03:18,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:19,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:03:22,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=460033.3333333333, ans=0.125 2023-09-29 19:03:32,908 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=460100.0, ans=0.125 2023-09-29 19:03:42,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=460166.6666666667, ans=0.1 2023-09-29 19:03:42,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=460166.6666666667, ans=0.125 2023-09-29 19:04:03,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=460233.3333333333, ans=0.2 2023-09-29 19:04:11,508 INFO [train.py:1039] (0/4) Epoch 13, batch 5300, loss[loss=0.1826, simple_loss=0.2362, pruned_loss=0.06451, over 23648.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2641, pruned_loss=0.05999, over 4709343.52 frames. ], batch size: 256, lr: 7.82e-03, grad_scale: 16.0 2023-09-29 19:04:11,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=460300.0, ans=0.0 2023-09-29 19:04:22,459 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.900e+02 2.152e+02 2.840e+02 4.256e+02, threshold=4.304e+02, percent-clipped=0.0 2023-09-29 19:04:23,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=460300.0, ans=0.1 2023-09-29 19:04:26,328 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-13.pt 2023-09-29 19:04:31,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 19:04:31,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 19:04:31,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 19:04:31,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:32,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:32,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:32,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:32,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:32,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:04:32,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:32,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:04:33,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:04:33,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. 
Duration: 0.9 2023-09-29 19:04:33,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 19:04:33,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 19:04:33,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:04:33,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 19:04:33,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 19:04:33,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:34,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:34,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:34,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:35,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:04:35,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:35,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:04:35,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:35,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:35,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:35,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:04:35,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:35,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:04:36,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 19:04:36,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:37,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:37,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 19:04:37,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 19:04:37,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 19:04:37,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:04:37,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 19:04:38,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 19:04:38,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:39,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:04:39,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:39,508 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 19:04:39,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 19:04:39,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:04:39,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:39,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 19:04:40,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. 
Duration: 0.96 2023-09-29 19:04:40,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 19:04:40,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:43,512 INFO [train.py:1039] (0/4) Epoch 14, batch 0, loss[loss=0.1865, simple_loss=0.2544, pruned_loss=0.0593, over 23719.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2544, pruned_loss=0.0593, over 23719.00 frames. ], batch size: 164, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:04:43,513 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 19:04:58,063 INFO [train.py:1071] (0/4) Epoch 14, validation: loss=0.2893, simple_loss=0.2709, pruned_loss=0.1538, over 1125622.00 frames. 2023-09-29 19:04:58,064 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-29 19:05:00,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 19:05:01,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:05:03,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:05:09,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:09,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:05:10,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:05:10,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 19:05:12,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=460446.6666666667, ans=0.125 2023-09-29 19:05:13,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 19:05:17,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:18,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:20,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=460446.6666666667, ans=0.0 2023-09-29 19:05:22,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:24,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:24,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:05:24,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:26,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 19:05:28,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 19:05:37,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:05:37,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:40,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 19:05:42,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=460513.3333333333, ans=0.125 2023-09-29 19:05:45,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:05:45,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:05:46,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:05:50,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:05:55,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:00,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 19:06:05,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 19:06:05,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:06:05,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:07,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:06:07,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:10,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 19:06:13,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:14,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:14,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=460646.6666666667, ans=0.125 2023-09-29 19:06:17,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:06:20,124 INFO [train.py:1039] (0/4) Epoch 14, batch 50, loss[loss=0.1881, simple_loss=0.2735, pruned_loss=0.05132, over 24289.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2661, pruned_loss=0.05888, over 1075822.32 frames. ], batch size: 74, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:06:20,355 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 19:06:21,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 19:06:24,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:26,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:26,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 19:06:26,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=460713.3333333333, ans=0.1 2023-09-29 19:06:28,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:06:28,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:06:31,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:33,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:35,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:37,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:38,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 19:06:38,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:06:45,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:06:46,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:48,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 19:06:50,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 19:06:50,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:06:52,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:06:52,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:52,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:53,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:06:55,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:06:55,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:07:01,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 19:07:04,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:04,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:07:06,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 19:07:09,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:07:11,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:07:11,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 19:07:11,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:12,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 19:07:20,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:07:20,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:20,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:22,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 19:07:22,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:25,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 19:07:26,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 19:07:26,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:26,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:29,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:07:30,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:30,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 19:07:31,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 19:07:32,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=460980.0, ans=0.1 2023-09-29 19:07:33,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 19:07:35,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:07:35,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:07:35,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 19:07:35,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 19:07:37,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:38,445 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.985e+02 2.220e+02 2.670e+02 4.594e+02, threshold=4.441e+02, percent-clipped=1.0 2023-09-29 19:07:38,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:40,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:07:40,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:07:43,622 INFO [train.py:1039] (0/4) Epoch 14, batch 100, loss[loss=0.2076, simple_loss=0.2704, pruned_loss=0.0724, over 23766.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.271, pruned_loss=0.06294, over 1860637.44 frames. ], batch size: 232, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:07:43,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:07:45,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:07:50,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:07:52,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 19:07:52,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:52,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=461046.6666666667, ans=0.0 2023-09-29 19:07:57,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:07:58,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:07:58,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:58,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:58,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:08:00,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 19:08:00,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:08:01,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:08:01,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:01,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:08:04,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 19:08:07,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:07,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:08,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:08:10,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:08:10,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=461113.3333333333, ans=0.125 2023-09-29 19:08:13,307 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 19:08:15,210 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 19:08:16,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:08:16,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 19:08:19,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:08:22,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:25,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:30,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:31,740 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 19:08:31,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=461246.6666666667, ans=0.125 2023-09-29 19:08:33,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:08:37,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:08:39,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:08:41,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:44,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:47,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 19:08:47,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:08:50,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=461313.3333333333, ans=0.1 2023-09-29 19:08:51,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:53,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:54,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:54,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:08:54,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:56,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 19:08:56,492 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 19:08:57,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:58,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 19:08:58,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=461313.3333333333, ans=0.125 2023-09-29 19:09:00,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:00,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:00,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 19:09:00,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:09:01,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:09:01,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:03,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:05,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:05,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:09:05,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:09:05,614 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=461380.0, ans=0.125 2023-09-29 19:09:06,634 INFO [train.py:1039] (0/4) Epoch 14, batch 150, loss[loss=0.1593, simple_loss=0.2351, pruned_loss=0.04179, over 21825.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2698, pruned_loss=0.06165, over 2509407.73 frames. ], batch size: 48, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:09:07,278 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.09 vs. limit=15.0 2023-09-29 19:09:08,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:10,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:09:10,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:11,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:11,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=461380.0, ans=0.125 2023-09-29 19:09:14,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:16,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:19,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 19:09:19,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:24,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 19:09:24,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 19:09:24,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 19:09:27,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:09:27,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:09:29,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:09:29,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:09:29,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:30,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:31,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:32,642 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-29 19:09:34,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:40,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:43,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:09:44,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 19:09:47,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:09:47,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:47,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:09:51,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:09:52,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:54,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:09:56,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:56,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-09-29 19:10:02,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=461580.0, ans=0.125 2023-09-29 19:10:03,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:03,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:04,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:10:04,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:10:07,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:08,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 19:10:12,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:10:13,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:10:13,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:16,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:10:16,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. 
Duration: 0.69 2023-09-29 19:10:16,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:10:16,456 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 19:10:18,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=461646.6666666667, ans=0.125 2023-09-29 19:10:21,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:23,776 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.853e+02 2.115e+02 2.469e+02 4.470e+02, threshold=4.229e+02, percent-clipped=1.0 2023-09-29 19:10:26,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:10:26,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:10:29,880 INFO [train.py:1039] (0/4) Epoch 14, batch 200, loss[loss=0.1819, simple_loss=0.2603, pruned_loss=0.05179, over 24467.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2696, pruned_loss=0.06103, over 3008784.18 frames. ], batch size: 63, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:10:30,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 19:10:30,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:30,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 19:10:34,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 19:10:36,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:10:37,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:38,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:38,143 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=461713.3333333333, ans=0.125 2023-09-29 19:10:43,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:10:43,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:43,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:50,050 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=461780.0, ans=0.0 2023-09-29 19:10:55,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=461780.0, ans=0.0 2023-09-29 19:11:01,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:11:03,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:11:04,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:11:05,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:11:05,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:11:05,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:11:06,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:08,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:11:08,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:09,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:11,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 19:11:12,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:11:12,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:16,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 19:11:23,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:33,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:35,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:11:39,922 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.83 vs. limit=15.0 2023-09-29 19:11:42,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:45,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 19:11:46,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:46,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:11:47,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:47,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:11:49,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 19:11:49,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:11:49,524 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 19:11:50,900 INFO [train.py:1039] (0/4) Epoch 14, batch 250, loss[loss=0.175, simple_loss=0.2488, pruned_loss=0.0506, over 24432.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2674, pruned_loss=0.06037, over 3391955.04 frames. ], batch size: 58, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:11:52,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:54,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:11:56,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:56,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:57,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:11:57,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:59,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:12:03,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 19:12:07,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=462113.3333333333, ans=0.0 2023-09-29 19:12:17,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:19,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:12:20,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:12:26,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:12:28,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:12:28,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:12:28,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:30,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:12:30,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:12:30,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:32,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 19:12:32,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=462180.0, ans=0.1 2023-09-29 19:12:35,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 19:12:35,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:36,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:12:36,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:12:36,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:12:38,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:12:38,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:12:38,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:12:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:42,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:12:44,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:12:47,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:12:51,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:55,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:13:00,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:01,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:13:04,719 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.56 vs. limit=15.0 2023-09-29 19:13:05,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 19:13:05,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=462313.3333333333, ans=0.0 2023-09-29 19:13:07,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:07,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 19:13:10,062 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.958e+02 2.110e+02 2.520e+02 4.183e+02, threshold=4.220e+02, percent-clipped=0.0 2023-09-29 19:13:10,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 19:13:10,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:13:11,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:13:11,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 19:13:13,248 INFO [train.py:1039] (0/4) Epoch 14, batch 300, loss[loss=0.1826, simple_loss=0.2233, pruned_loss=0.07098, over 19071.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2647, pruned_loss=0.05969, over 3690161.17 frames. ], batch size: 388, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:13:15,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=462380.0, ans=0.125 2023-09-29 19:13:19,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:13:19,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:13:19,995 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.62 vs. limit=15.0 2023-09-29 19:13:20,374 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.37 vs. 
limit=22.5 2023-09-29 19:13:22,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:13:24,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 19:13:24,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=462380.0, ans=0.125 2023-09-29 19:13:26,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:27,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:13:27,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 19:13:27,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:27,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=462380.0, ans=0.1 2023-09-29 19:13:31,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:13:37,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:13:37,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 19:13:42,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. 
Duration: 0.81 2023-09-29 19:13:43,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:45,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:47,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:47,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 19:13:47,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:13:50,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:13:51,813 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.51 vs. limit=10.0 2023-09-29 19:13:53,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:13:53,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:58,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:13:58,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 19:13:58,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 19:14:02,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:04,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 19:14:05,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:08,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:14:12,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:14:12,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 19:14:18,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:18,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:14:21,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:22,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:14:22,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 19:14:23,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 19:14:23,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:14:26,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 19:14:27,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:27,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:28,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=462646.6666666667, ans=0.0 2023-09-29 19:14:29,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:29,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:29,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:32,188 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=12.0 2023-09-29 19:14:35,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:37,256 INFO [train.py:1039] (0/4) Epoch 14, batch 350, loss[loss=0.205, simple_loss=0.2667, pruned_loss=0.0716, over 23744.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2616, pruned_loss=0.05877, over 3906775.92 frames. 
], batch size: 164, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:14:37,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:14:39,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:46,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:50,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:50,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:53,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 19:14:55,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:55,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 19:14:55,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=462780.0, ans=0.125 2023-09-29 19:14:58,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:58,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. 
Duration: 0.59 2023-09-29 19:14:59,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=462780.0, ans=0.125 2023-09-29 19:15:00,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:02,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 19:15:03,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:15:07,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:07,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:15:09,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:09,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:10,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:15:13,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 19:15:13,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:22,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:15:22,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:15:23,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:15:24,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:28,124 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.00 vs. limit=10.0 2023-09-29 19:15:30,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 19:15:30,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:35,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:35,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:35,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 19:15:38,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 19:15:38,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=462913.3333333333, ans=0.125 2023-09-29 19:15:41,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:41,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=462980.0, ans=0.2 2023-09-29 19:15:43,175 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 19:15:45,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 19:15:46,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:48,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:48,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 19:15:49,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:50,770 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.26 vs. limit=22.5 2023-09-29 19:15:51,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 19:15:53,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:54,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=462980.0, ans=0.07 2023-09-29 19:15:54,046 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:15:57,094 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.848e+02 2.063e+02 2.317e+02 4.440e+02, threshold=4.125e+02, percent-clipped=1.0 2023-09-29 19:15:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:57,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:58,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:16:00,133 INFO [train.py:1039] (0/4) Epoch 14, batch 400, loss[loss=0.2018, simple_loss=0.2716, pruned_loss=0.06597, over 23592.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.262, pruned_loss=0.05846, over 4096334.32 frames. ], batch size: 149, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:16:02,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:16:05,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:16:05,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. 
Duration: 0.64 2023-09-29 19:16:05,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:06,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:08,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:16:08,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:10,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:13,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:17,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 19:16:18,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 19:16:18,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:19,474 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.31 vs. limit=22.5 2023-09-29 19:16:20,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 19:16:20,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:16:20,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=463113.3333333333, ans=0.125 2023-09-29 19:16:23,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:16:24,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:24,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 19:16:26,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:16:26,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:26,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:28,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:31,377 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 19:16:31,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 19:16:34,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:16:35,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=463180.0, ans=0.125 2023-09-29 19:16:36,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:37,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 19:16:39,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 19:16:42,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:16:45,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:16:48,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=463246.6666666667, ans=0.125 2023-09-29 19:16:49,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=463246.6666666667, ans=0.125 2023-09-29 19:16:51,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=463246.6666666667, ans=0.125 2023-09-29 19:16:52,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 19:16:54,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=463246.6666666667, ans=0.125 2023-09-29 19:16:55,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 19:16:57,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 19:17:00,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:17:01,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:17:01,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 19:17:05,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:17:07,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:17:08,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:17:11,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:13,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 19:17:14,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:17:15,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 19:17:17,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 19:17:18,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:17:20,953 INFO [train.py:1039] (0/4) Epoch 14, batch 450, loss[loss=0.2582, simple_loss=0.3028, pruned_loss=0.1068, over 19588.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2628, pruned_loss=0.05902, over 4238914.38 frames. ], batch size: 389, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:17:21,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 19:17:25,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:17:25,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:17:25,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:17:26,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 19:17:26,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:17:28,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:17:28,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:17:28,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. 
Duration: 0.99 2023-09-29 19:17:28,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=463380.0, ans=0.035 2023-09-29 19:17:29,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:17:30,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:17:30,817 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.13 vs. limit=15.0 2023-09-29 19:17:31,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:17:42,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:42,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:17:45,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 19:17:46,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 19:17:48,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:17:50,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:53,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:17:57,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:17:57,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:17:57,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=463513.3333333333, ans=0.2 2023-09-29 19:18:00,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 19:18:00,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 19:18:01,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 19:18:02,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:03,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:04,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:18:07,152 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 19:18:07,166 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 19:18:07,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 19:18:10,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:18:10,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:18:15,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:18:15,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:18:16,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:18:17,348 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.63 vs. limit=10.0 2023-09-29 19:18:18,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 19:18:21,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:24,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:18:24,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:18:25,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 19:18:30,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 19:18:32,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 19:18:32,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 19:18:34,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:39,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:18:40,721 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.869e+02 2.102e+02 2.617e+02 3.390e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-29 19:18:40,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:18:43,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:18:43,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 19:18:44,438 INFO [train.py:1039] (0/4) Epoch 14, batch 500, loss[loss=0.1693, simple_loss=0.2497, pruned_loss=0.04448, over 24575.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2631, pruned_loss=0.05949, over 4334687.69 frames. ], batch size: 60, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:18:46,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:48,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 19:18:48,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:48,446 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 19:18:48,792 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=463713.3333333333, ans=0.0 2023-09-29 19:18:51,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 19:18:51,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:55,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:18:59,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:19:00,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:19:03,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:19:03,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:19:05,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:19:14,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=463780.0, ans=0.125 2023-09-29 19:19:15,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:16,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:19:16,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:19:18,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:18,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 19:19:18,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:19:20,905 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.09 vs. limit=15.0 2023-09-29 19:19:22,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:19:23,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:19:23,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 19:19:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:26,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 19:19:29,651 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 19:19:32,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:34,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:36,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:19:39,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 19:19:42,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:19:44,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:47,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:19:50,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:57,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:59,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 19:19:59,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:59,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:20:04,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 19:20:05,335 INFO [train.py:1039] (0/4) Epoch 14, batch 550, loss[loss=0.1889, simple_loss=0.275, pruned_loss=0.05145, over 24354.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2644, pruned_loss=0.06025, over 4413909.71 frames. ], batch size: 74, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:20:05,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:20:07,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:08,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=464046.6666666667, ans=0.125 2023-09-29 19:20:13,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. 
Duration: 0.95 2023-09-29 19:20:14,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 19:20:14,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:14,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 19:20:15,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:20:17,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:17,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:20:20,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:20:22,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:23,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 19:20:23,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 19:20:28,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:28,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:30,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:20:30,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:37,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 19:20:38,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 19:20:39,299 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.02 vs. limit=6.0 2023-09-29 19:20:40,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:20:45,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:20:45,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:45,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=464180.0, ans=0.125 2023-09-29 19:20:46,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-09-29 19:20:50,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:50,475 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 19:20:52,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:52,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:20:55,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:57,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:20:57,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:20:57,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:58,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 19:20:59,796 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.67 vs. limit=10.0 2023-09-29 19:21:00,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 19:21:01,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:21:01,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:21:01,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:01,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:21:05,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:21:07,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:21:10,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:21:10,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:12,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 19:21:13,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:21:15,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:15,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:21:16,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:21:18,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:21:18,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:21:22,664 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.73 vs. limit=22.5 2023-09-29 19:21:25,256 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.884e+02 2.077e+02 2.403e+02 3.738e+02, threshold=4.154e+02, percent-clipped=0.0 2023-09-29 19:21:25,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 19:21:28,504 INFO [train.py:1039] (0/4) Epoch 14, batch 600, loss[loss=0.1876, simple_loss=0.2713, pruned_loss=0.05191, over 24534.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2651, pruned_loss=0.0609, over 4461583.41 frames. ], batch size: 71, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:21:30,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 19:21:31,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:21:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:21:31,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:21:36,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=464380.0, ans=0.5 2023-09-29 19:21:38,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=464380.0, ans=10.0 2023-09-29 19:21:40,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:21:40,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:21:42,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 19:21:45,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:21:46,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:21:49,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:51,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 19:21:51,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 19:21:52,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=464446.6666666667, ans=0.1 2023-09-29 19:21:54,664 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.70 vs. limit=15.0 2023-09-29 19:21:58,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 19:22:01,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:22:01,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:22:02,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=464513.3333333333, ans=10.0 2023-09-29 19:22:08,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:22:08,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:22:10,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:22:15,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=464513.3333333333, ans=0.5 2023-09-29 19:22:16,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:22:21,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:21,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:22:21,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:29,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=464580.0, ans=0.125 2023-09-29 19:22:32,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 19:22:36,485 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=464646.6666666667, ans=0.125 2023-09-29 19:22:36,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=464646.6666666667, ans=0.125 2023-09-29 19:22:37,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:22:37,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:22:42,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 19:22:42,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:22:44,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 19:22:44,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:22:46,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:22:50,972 INFO [train.py:1039] (0/4) Epoch 14, batch 650, loss[loss=0.2024, simple_loss=0.2618, pruned_loss=0.07153, over 23806.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.264, pruned_loss=0.06015, over 4498104.33 frames. ], batch size: 150, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:22:51,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:22:53,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:22:56,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:22:56,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:22:58,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:22:58,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 19:22:58,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=464713.3333333333, ans=0.125 2023-09-29 19:23:00,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:23:06,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:23:06,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:10,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:13,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 19:23:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:15,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:20,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:20,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:23:23,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 19:23:23,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:23,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:23:25,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:26,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:23:29,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:23:29,628 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 19:23:29,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:29,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:33,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:35,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:35,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:23:35,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 19:23:36,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 19:23:36,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:23:38,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:23:39,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:23:39,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:41,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:23:41,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 19:23:44,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 19:23:44,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:44,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:44,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:23:44,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:23:48,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:54,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:54,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:56,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:24:00,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:00,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:24:02,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:09,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:24:09,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:09,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 19:24:10,967 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.056e+02 2.592e+02 3.186e+02 5.109e+02, threshold=5.184e+02, percent-clipped=6.0 2023-09-29 19:24:11,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:13,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=465046.6666666667, ans=0.2 2023-09-29 19:24:13,270 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.11 vs. limit=10.0 2023-09-29 19:24:14,008 INFO [train.py:1039] (0/4) Epoch 14, batch 700, loss[loss=0.1864, simple_loss=0.2542, pruned_loss=0.05929, over 23736.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.2626, pruned_loss=0.05997, over 4526962.99 frames. ], batch size: 212, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:24:17,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 19:24:17,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 19:24:20,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 19:24:22,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:23,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:24:23,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=465046.6666666667, ans=0.0 2023-09-29 19:24:25,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 19:24:25,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=465046.6666666667, ans=0.07 2023-09-29 19:24:30,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:33,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:24:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:38,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:24:38,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:24:40,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:40,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=465113.3333333333, ans=0.0 2023-09-29 19:24:43,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-29 19:24:43,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:24:44,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=465113.3333333333, ans=0.0 2023-09-29 19:24:47,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 19:24:50,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 19:24:53,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:24:53,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:24:55,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:25:00,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:25:00,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 19:25:03,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.93 vs. limit=15.0 2023-09-29 19:25:03,298 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.01 vs. 
limit=10.0 2023-09-29 19:25:04,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=465246.6666666667, ans=0.1 2023-09-29 19:25:06,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:06,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:25:06,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 19:25:06,509 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=465246.6666666667, ans=0.1 2023-09-29 19:25:10,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:25:12,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:14,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=465246.6666666667, ans=0.125 2023-09-29 19:25:17,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:25:17,492 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=465246.6666666667, ans=0.125 2023-09-29 19:25:19,439 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.21 vs. 
limit=15.0 2023-09-29 19:25:24,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:25:24,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 19:25:27,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 19:25:27,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 19:25:30,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:32,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:32,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:25:34,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:34,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 19:25:37,500 INFO [train.py:1039] (0/4) Epoch 14, batch 750, loss[loss=0.2015, simple_loss=0.2638, pruned_loss=0.06953, over 23441.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2628, pruned_loss=0.05964, over 4580662.99 frames. 
], batch size: 285, lr: 7.50e-03, grad_scale: 8.0 2023-09-29 19:25:38,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=465380.0, ans=0.1 2023-09-29 19:25:39,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 19:25:39,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 19:25:39,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 19:25:42,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 19:25:42,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 19:25:42,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:25:44,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 19:25:45,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:45,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:25:47,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:50,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:25:50,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:25:51,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:52,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:25:52,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:25:56,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:25:57,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:59,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:59,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 19:26:00,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:26:02,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:04,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:05,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 19:26:05,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 19:26:05,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:09,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 19:26:09,635 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 19:26:11,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 19:26:11,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:26:12,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:26:14,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:26:16,577 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:26:21,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:26:21,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:21,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 19:26:24,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:26:26,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:26:26,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 19:26:28,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:26:28,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 19:26:29,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:26:31,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:26:32,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 19:26:33,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:39,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:26:40,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:26:41,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:26:44,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:26:45,196 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.33 vs. limit=15.0 2023-09-29 19:26:47,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 19:26:47,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:26:49,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:51,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:52,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:54,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:54,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:26:59,101 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 2.067e+02 2.400e+02 2.919e+02 4.074e+02, threshold=4.801e+02, percent-clipped=0.0 2023-09-29 19:27:01,193 INFO [train.py:1039] (0/4) Epoch 14, batch 800, loss[loss=0.2091, simple_loss=0.2763, pruned_loss=0.07099, over 23853.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2633, pruned_loss=0.05995, over 4606168.68 frames. 
], batch size: 195, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:27:02,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:02,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:04,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:27:04,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:05,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:06,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:07,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:08,334 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.73 vs. limit=6.0 2023-09-29 19:27:10,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:12,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 19:27:12,451 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:27:16,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 19:27:16,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:17,749 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.75 vs. limit=22.5 2023-09-29 19:27:18,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:18,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:27:18,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:21,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 19:27:21,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:21,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 19:27:24,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:25,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:27:28,121 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.69 vs. limit=10.0 2023-09-29 19:27:29,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:27:29,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:32,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:32,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:35,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:27:35,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=465846.6666666667, ans=0.125 2023-09-29 19:27:37,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:27:37,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 19:27:38,821 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 19:27:40,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 19:27:40,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 19:27:40,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:43,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:43,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:27:46,347 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 19:27:46,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 19:27:48,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:27:50,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:27:53,270 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.49 vs. limit=15.0 2023-09-29 19:27:54,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:27:57,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:00,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 19:28:00,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 19:28:03,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 19:28:10,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:14,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:28:14,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 19:28:16,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:28:17,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:17,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 19:28:19,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:20,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:28:20,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:22,317 INFO [train.py:1039] (0/4) Epoch 14, batch 850, loss[loss=0.1904, simple_loss=0.2647, pruned_loss=0.05802, over 24625.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.265, pruned_loss=0.06119, over 4618671.26 frames. 
], batch size: 65, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:28:22,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:28:23,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:28:24,908 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=15.0 2023-09-29 19:28:26,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 19:28:26,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 19:28:26,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 19:28:27,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:27,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:28:30,439 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.11 vs. limit=22.5 2023-09-29 19:28:30,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:30,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:28:31,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:28:37,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:37,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:39,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 19:28:43,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 19:28:46,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:47,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 19:28:50,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=12.0 2023-09-29 19:28:52,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 19:28:52,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 19:28:55,743 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 19:28:55,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 19:28:55,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:28:55,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:28:59,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 19:29:03,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:29:03,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=466180.0, ans=0.125 2023-09-29 19:29:04,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:06,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:29:06,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:29:09,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 19:29:09,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=466180.0, ans=0.125 2023-09-29 19:29:10,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:29:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 19:29:16,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:29:16,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:16,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:29:18,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:19,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:22,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:24,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:29:27,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:29:27,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:29:28,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:29:38,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:29:39,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:39,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 19:29:39,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:40,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:42,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 19:29:43,736 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.955e+02 2.156e+02 2.451e+02 4.149e+02, threshold=4.312e+02, percent-clipped=0.0 2023-09-29 19:29:45,184 INFO [train.py:1039] (0/4) Epoch 14, batch 900, loss[loss=0.1846, simple_loss=0.2692, pruned_loss=0.05006, over 24657.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2661, pruned_loss=0.06106, over 4651545.76 frames. ], batch size: 73, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:29:46,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:29:48,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:29:50,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 19:29:53,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:29:53,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 19:29:54,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=466380.0, ans=0.125 2023-09-29 19:29:56,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:29:57,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:57,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:29:57,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:29:59,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:30:03,188 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.19 vs. limit=10.0 2023-09-29 19:30:12,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:12,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:30:12,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:30:16,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=466446.6666666667, ans=0.0 2023-09-29 19:30:17,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:21,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=466513.3333333333, ans=0.0 2023-09-29 19:30:22,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 19:30:25,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:30:27,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=466513.3333333333, ans=0.125 2023-09-29 19:30:30,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:30:31,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:30:33,405 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 19:30:35,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-09-29 19:30:38,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=466580.0, ans=0.1 2023-09-29 19:30:41,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:30:41,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:30:41,757 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.63 vs. limit=22.5 2023-09-29 19:30:42,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:30:49,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:49,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:30:53,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 19:30:53,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:56,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 19:30:57,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:30:57,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:30:59,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:30:59,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:04,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 19:31:04,465 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 19:31:06,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:31:06,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 19:31:08,202 INFO [train.py:1039] (0/4) Epoch 14, batch 950, loss[loss=0.1884, simple_loss=0.2654, pruned_loss=0.0557, over 24548.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2659, pruned_loss=0.06039, over 4674488.66 frames. ], batch size: 71, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:31:08,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:12,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 19:31:18,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:20,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:31:22,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:31:24,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=466780.0, ans=0.125 2023-09-29 19:31:25,964 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 19:31:31,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:32,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=466780.0, ans=0.0 2023-09-29 19:31:33,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:33,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:33,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=466780.0, ans=0.09899494936611666 2023-09-29 19:31:34,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:31:34,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 19:31:34,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. 
Duration: 0.52225 2023-09-29 19:31:36,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:38,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 19:31:38,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=466780.0, ans=0.02 2023-09-29 19:31:39,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:44,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:44,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 19:31:46,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:31:46,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:46,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=466846.6666666667, ans=0.0 2023-09-29 19:31:50,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 19:31:56,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:31:56,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:58,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 19:32:02,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:32:02,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:32:03,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:04,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:04,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:32:04,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=466913.3333333333, ans=0.125 2023-09-29 19:32:10,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 19:32:10,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:32:11,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:32:13,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:13,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 19:32:13,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:13,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:32:13,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=466980.0, ans=0.1 2023-09-29 19:32:14,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 19:32:18,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:32:23,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:25,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=466980.0, ans=0.125 2023-09-29 19:32:28,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:32:30,240 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.850e+02 2.095e+02 2.342e+02 3.294e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 19:32:30,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 19:32:30,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 19:32:31,867 INFO [train.py:1039] (0/4) Epoch 14, batch 1000, loss[loss=0.2002, simple_loss=0.2778, pruned_loss=0.06129, over 24641.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2646, pruned_loss=0.05995, over 4691141.52 frames. ], batch size: 65, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:32:32,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 19:32:37,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:32:40,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=467046.6666666667, ans=0.0 2023-09-29 19:32:43,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:32:43,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=467046.6666666667, ans=0.0 2023-09-29 19:32:45,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. 
Duration: 0.86 2023-09-29 19:32:45,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 19:32:48,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.79 vs. limit=15.0 2023-09-29 19:32:49,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:32:49,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:51,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:54,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 19:32:57,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=467113.3333333333, ans=0.125 2023-09-29 19:32:59,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 19:33:00,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 19:33:02,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:05,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 19:33:06,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. 
Duration: 0.5 2023-09-29 19:33:06,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 19:33:08,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:10,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:12,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=467180.0, ans=0.0 2023-09-29 19:33:18,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:18,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:33:18,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:19,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:19,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 19:33:19,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:21,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:33:21,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:33:22,922 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 19:33:23,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=467246.6666666667, ans=0.0 2023-09-29 19:33:24,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 19:33:25,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=467246.6666666667, ans=0.0 2023-09-29 19:33:26,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 19:33:29,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 19:33:32,287 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-09-29 19:33:33,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:33:36,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=467313.3333333333, ans=0.1 2023-09-29 19:33:37,289 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.45 vs. limit=15.0 2023-09-29 19:33:38,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:38,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 19:33:38,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:39,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:33:43,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 19:33:43,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:33:44,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 19:33:45,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 19:33:46,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:33:46,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:48,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:33:51,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:33:51,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:54,433 INFO [train.py:1039] (0/4) Epoch 14, batch 1050, loss[loss=0.1868, simple_loss=0.2511, pruned_loss=0.06125, over 23822.00 frames. 
], tot_loss[loss=0.1908, simple_loss=0.2626, pruned_loss=0.05948, over 4692204.42 frames. ], batch size: 212, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:33:56,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:33:56,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:33:57,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=467380.0, ans=0.0 2023-09-29 19:33:59,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:33:59,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=467380.0, ans=0.0 2023-09-29 19:34:01,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:04,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:06,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:34:08,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:34:10,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:34:10,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 19:34:10,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:34:12,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:34:12,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 19:34:13,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:15,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 19:34:17,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:34:17,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 19:34:17,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:34:21,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=467446.6666666667, ans=0.025 2023-09-29 19:34:25,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:25,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:34:27,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 19:34:29,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 19:34:29,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 19:34:30,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:32,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 19:34:34,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 19:34:36,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:34:39,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:34:42,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:34:44,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:34:45,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:34:47,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:34:51,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. 
Duration: 0.86 2023-09-29 19:34:53,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 19:34:53,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 19:34:55,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:34:55,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:34:56,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 19:35:01,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:35:02,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:35:02,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:02,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:04,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. 
Duration: 0.98 2023-09-29 19:35:09,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:09,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 19:35:09,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 19:35:11,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:35:11,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=467646.6666666667, ans=0.125 2023-09-29 19:35:15,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:35:17,481 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.954e+02 2.210e+02 2.556e+02 3.990e+02, threshold=4.421e+02, percent-clipped=0.0 2023-09-29 19:35:19,002 INFO [train.py:1039] (0/4) Epoch 14, batch 1100, loss[loss=0.2127, simple_loss=0.2736, pruned_loss=0.07593, over 23762.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2627, pruned_loss=0.0593, over 4705650.83 frames. ], batch size: 179, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:35:23,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:35:25,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=467713.3333333333, ans=0.125 2023-09-29 19:35:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 19:35:28,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:35:28,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:28,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 19:35:30,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:35:31,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:35:32,745 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.48 vs. limit=15.0 2023-09-29 19:35:33,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:35:37,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:35:38,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 19:35:38,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:35:40,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:40,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 19:35:40,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=467780.0, ans=0.125 2023-09-29 19:35:44,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:35:46,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:35:52,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:35:54,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 19:35:55,743 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 19:35:57,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:36:01,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:36:03,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 19:36:03,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 19:36:03,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:36:03,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:36:04,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:04,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 19:36:08,107 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=467913.3333333333, ans=0.04949747468305833 2023-09-29 19:36:12,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:36:12,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 19:36:14,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:36:19,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:36:22,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 19:36:22,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 19:36:23,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=467980.0, ans=0.1 2023-09-29 19:36:23,722 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.00 vs. limit=22.5 2023-09-29 19:36:25,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:28,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:28,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:29,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 19:36:29,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:36:31,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:31,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 19:36:31,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:36:31,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 19:36:34,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:36:34,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:36:35,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:36:37,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=467980.0, ans=0.0 2023-09-29 19:36:40,207 INFO [train.py:1039] (0/4) Epoch 14, batch 1150, loss[loss=0.1785, simple_loss=0.2615, pruned_loss=0.04779, over 24335.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2633, pruned_loss=0.05927, over 4705062.99 frames. ], batch size: 61, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:36:41,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:45,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:36:46,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:48,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:36:48,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 19:36:48,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:36:51,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. 
Duration: 0.69 2023-09-29 19:36:52,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:52,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:36:58,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 19:37:02,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:06,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:37:07,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:08,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 19:37:08,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:37:08,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:37:11,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 19:37:11,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:13,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:37:23,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 19:37:31,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:33,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:37,489 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 19:37:38,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:39,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=468246.6666666667, ans=0.1 2023-09-29 19:37:45,781 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.58 vs. limit=15.0 2023-09-29 19:37:46,575 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 19:37:49,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:37:49,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=468313.3333333333, ans=0.1 2023-09-29 19:37:51,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:37:51,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:37:51,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:37:55,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=468313.3333333333, ans=0.1 2023-09-29 19:37:56,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:01,581 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.871e+02 2.319e+02 2.937e+02 5.340e+02, threshold=4.639e+02, percent-clipped=1.0 2023-09-29 19:38:01,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:38:01,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:38:02,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=468380.0, ans=0.125 2023-09-29 19:38:03,177 INFO [train.py:1039] (0/4) Epoch 14, batch 1200, loss[loss=0.1916, simple_loss=0.2709, pruned_loss=0.05615, over 24333.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.264, pruned_loss=0.05886, over 4713769.77 frames. 
], batch size: 77, lr: 7.48e-03, grad_scale: 32.0 2023-09-29 19:38:03,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:03,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:03,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=468380.0, ans=0.125 2023-09-29 19:38:04,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:38:08,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:38:10,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:38:11,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:13,133 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.54 vs. limit=15.0 2023-09-29 19:38:13,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:13,866 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:38:14,999 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 19:38:16,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. 
Duration: 0.51 2023-09-29 19:38:21,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:38:22,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:38:26,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:27,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:38:27,946 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 19:38:28,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:32,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=468446.6666666667, ans=0.2 2023-09-29 19:38:37,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:38:37,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:38:37,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 19:38:39,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:38:44,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. 
Duration: 0.72 2023-09-29 19:38:47,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 19:38:47,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:49,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:49,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:38:50,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:38:53,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:53,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:38:53,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:38:55,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 19:38:56,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:38:56,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:38:56,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 19:39:00,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:00,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:02,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:39:03,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:39:06,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 19:39:10,632 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 19:39:14,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:17,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:39:18,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:39:21,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:39:21,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=468646.6666666667, ans=0.0 2023-09-29 19:39:21,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=468646.6666666667, ans=0.125 2023-09-29 19:39:24,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 19:39:25,445 INFO [train.py:1039] (0/4) Epoch 14, batch 1250, loss[loss=0.2635, simple_loss=0.3131, pruned_loss=0.1069, over 19504.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2659, pruned_loss=0.06008, over 4706315.26 frames. ], batch size: 388, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:39:28,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:39:29,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:30,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 19:39:33,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:39:34,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:39:37,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=468713.3333333333, ans=0.95 2023-09-29 19:39:39,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 19:39:40,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:41,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:39:41,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:43,225 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=468780.0, ans=0.125 2023-09-29 19:39:44,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:39:46,839 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=468780.0, ans=0.125 2023-09-29 19:39:48,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 19:39:49,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:39:49,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:50,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:50,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:39:54,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:56,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:40:00,782 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=15.0 2023-09-29 19:40:01,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 19:40:02,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:40:05,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:06,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 19:40:06,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:40:07,595 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 19:40:07,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:07,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:12,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:40:15,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:17,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:40:19,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 19:40:19,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 19:40:19,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 19:40:22,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:24,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 19:40:24,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:29,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:40:29,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:40:32,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 19:40:32,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 19:40:34,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:40:34,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:40:34,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:37,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 19:40:38,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:40,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:40:40,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:40:42,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=468980.0, ans=0.0 2023-09-29 19:40:45,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:40:46,415 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.815e+02 2.067e+02 2.359e+02 2.982e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-29 19:40:46,459 INFO [train.py:1039] (0/4) Epoch 14, batch 1300, loss[loss=0.2008, simple_loss=0.2737, pruned_loss=0.0639, over 23372.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2659, pruned_loss=0.0599, over 4709634.78 frames. 
], batch size: 119, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:40:48,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:48,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 19:40:48,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=469046.6666666667, ans=0.0 2023-09-29 19:40:49,290 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.19 vs. limit=15.0 2023-09-29 19:40:54,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:56,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:40:58,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:40:58,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:41:00,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:41:01,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 19:41:06,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 19:41:08,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:41:09,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 19:41:14,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:41:18,139 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=22.5 2023-09-29 19:41:18,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:19,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:22,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:41:24,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:24,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:41:24,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=469180.0, ans=0.0 2023-09-29 19:41:25,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:41:27,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-09-29 19:41:27,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=469180.0, ans=0.04949747468305833 2023-09-29 19:41:30,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:41:31,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:41:34,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 19:41:35,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:41:37,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:41:37,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=469246.6666666667, ans=0.1 2023-09-29 19:41:40,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:41:40,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 19:41:40,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:40,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 19:41:43,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:41:44,222 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=469246.6666666667, ans=0.125 2023-09-29 19:41:47,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:47,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:41:50,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 19:41:50,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 19:41:51,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=469313.3333333333, ans=0.125 2023-09-29 19:41:52,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 19:41:52,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=469313.3333333333, ans=0.125 2023-09-29 19:41:57,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:41:57,921 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.49 vs. 
limit=22.5 2023-09-29 19:41:59,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=469313.3333333333, ans=0.0 2023-09-29 19:41:59,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=469313.3333333333, ans=0.0 2023-09-29 19:41:59,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=469313.3333333333, ans=0.125 2023-09-29 19:42:00,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 19:42:02,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:09,256 INFO [train.py:1039] (0/4) Epoch 14, batch 1350, loss[loss=0.188, simple_loss=0.2759, pruned_loss=0.05002, over 24426.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2648, pruned_loss=0.05902, over 4715878.06 frames. ], batch size: 69, lr: 7.47e-03, grad_scale: 8.0 2023-09-29 19:42:11,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 19:42:14,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:16,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:17,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 19:42:17,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:18,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.53 vs. limit=15.0 2023-09-29 19:42:19,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=469380.0, ans=0.0 2023-09-29 19:42:20,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:20,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:22,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:42:24,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:24,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=469446.6666666667, ans=0.0 2023-09-29 19:42:27,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:29,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 19:42:30,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:42:30,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 19:42:33,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=469446.6666666667, ans=0.125 2023-09-29 19:42:35,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 19:42:35,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:42:36,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:42:36,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 19:42:39,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 19:42:41,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 19:42:43,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:43,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 19:42:45,909 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.39 vs. limit=15.0 2023-09-29 19:42:56,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 19:42:58,537 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.57 vs. limit=15.0 2023-09-29 19:42:58,569 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.50 vs. limit=15.0 2023-09-29 19:43:02,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=469580.0, ans=0.0 2023-09-29 19:43:06,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:43:06,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:06,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 19:43:11,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:11,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 19:43:12,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:43:12,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:43:14,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:43:16,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=469646.6666666667, ans=0.0 2023-09-29 19:43:16,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=469646.6666666667, ans=0.05 2023-09-29 19:43:17,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 19:43:19,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:43:24,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 19:43:25,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 19:43:31,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 19:43:32,807 INFO [train.py:1039] (0/4) Epoch 14, batch 1400, loss[loss=0.1914, simple_loss=0.2699, pruned_loss=0.05643, over 23473.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2632, pruned_loss=0.05891, over 4711610.89 frames. ], batch size: 93, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:43:32,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:34,253 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.861e+02 2.134e+02 2.363e+02 3.336e+02, threshold=4.269e+02, percent-clipped=0.0 2023-09-29 19:43:36,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:43:37,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:43:41,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=469713.3333333333, ans=0.125 2023-09-29 19:43:42,085 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.07 vs. limit=10.0 2023-09-29 19:43:43,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 19:43:44,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 19:43:54,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:43:56,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:43:57,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:43:58,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-09-29 19:43:58,305 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=469780.0, ans=0.05 2023-09-29 19:44:02,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=469780.0, ans=0.125 2023-09-29 19:44:03,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:44:06,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:44:16,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:21,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 19:44:21,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:44:21,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:44:22,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:44:24,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:44:24,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 19:44:25,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:44:25,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:44:27,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 19:44:27,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:44:29,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=469913.3333333333, ans=0.1 2023-09-29 19:44:31,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:34,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:44:41,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 19:44:41,829 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.25 vs. limit=6.0 2023-09-29 19:44:42,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:44:44,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:44:46,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:44:48,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:44:49,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:44:52,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:44:53,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:44:53,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:54,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:44:56,253 INFO [train.py:1039] (0/4) Epoch 14, batch 1450, loss[loss=0.2022, simple_loss=0.2848, pruned_loss=0.05985, over 24037.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2627, pruned_loss=0.05858, over 4722165.41 frames. ], batch size: 80, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:44:59,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:00,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:45:01,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 19:45:01,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 19:45:02,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:45:05,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 19:45:05,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:07,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:07,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 19:45:10,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:10,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:45:12,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 19:45:12,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:13,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:45:15,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:45:16,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:19,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=470113.3333333333, ans=0.125 2023-09-29 19:45:22,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:45:22,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:45:24,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:24,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:25,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:27,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:45:27,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:28,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:31,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. 
Duration: 0.73 2023-09-29 19:45:31,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=470180.0, ans=0.1 2023-09-29 19:45:34,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:37,558 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 19:45:40,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:45:41,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:45:43,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:43,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=470246.6666666667, ans=0.95 2023-09-29 19:45:44,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 19:45:50,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:51,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 19:45:51,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 19:45:54,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:45:57,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:45:59,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:00,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 19:46:04,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 19:46:05,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 19:46:06,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=470313.3333333333, ans=0.125 2023-09-29 19:46:07,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:09,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:46:18,268 INFO [train.py:1039] (0/4) Epoch 14, batch 1500, loss[loss=0.1957, simple_loss=0.2786, pruned_loss=0.05636, over 24305.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2632, pruned_loss=0.05888, over 4723327.30 frames. ], batch size: 74, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:46:19,673 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.877e+02 2.089e+02 2.456e+02 3.474e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 19:46:21,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. 
Duration: 0.77 2023-09-29 19:46:21,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:46:21,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:46:22,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:23,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:25,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:46:26,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 19:46:28,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:46:28,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:46:28,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:30,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:46:30,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:46:32,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:46:38,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:39,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 19:46:39,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:46:39,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:46:40,402 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.99 vs. limit=22.5 2023-09-29 19:46:41,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:44,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 19:46:46,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=470446.6666666667, ans=0.025 2023-09-29 19:46:48,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 19:46:50,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:51,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 19:46:53,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 19:46:56,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:46:57,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:57,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:59,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 19:47:00,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:47:00,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:01,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 19:47:01,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:03,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=470513.3333333333, ans=0.0 2023-09-29 19:47:07,514 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.86 vs. limit=15.0 2023-09-29 19:47:08,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:47:08,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. 
Duration: 0.94 2023-09-29 19:47:15,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:47:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:47:21,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 19:47:22,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:22,972 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 19:47:23,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:24,120 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.24 vs. limit=5.0 2023-09-29 19:47:24,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=470646.6666666667, ans=0.0 2023-09-29 19:47:25,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:47:26,068 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 19:47:27,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:47:29,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. 
Duration: 0.65 2023-09-29 19:47:31,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:33,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:33,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:33,774 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=470646.6666666667, ans=0.125 2023-09-29 19:47:34,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:34,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:36,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:47:36,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=470646.6666666667, ans=0.07 2023-09-29 19:47:37,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 19:47:39,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 19:47:39,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 19:47:39,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=470713.3333333333, ans=0.125 2023-09-29 19:47:40,989 INFO [train.py:1039] (0/4) Epoch 14, batch 1550, loss[loss=0.2099, simple_loss=0.2763, pruned_loss=0.07178, over 23394.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2646, pruned_loss=0.05961, over 4714053.43 frames. ], batch size: 285, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:47:41,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 19:47:42,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 19:47:44,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:47:46,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:46,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:47:46,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:47:48,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:50,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:53,233 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. 
Duration: 0.896 2023-09-29 19:47:53,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:53,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:47:53,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=470713.3333333333, ans=0.1 2023-09-29 19:47:54,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:47:58,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:47:58,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 19:48:00,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:48:00,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 19:48:02,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 19:48:02,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 19:48:02,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:04,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. 
Duration: 0.87275 2023-09-29 19:48:08,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:48:10,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 19:48:10,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 19:48:20,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:26,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:48:26,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:48:26,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:48:27,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 19:48:30,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:48:33,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:34,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=470913.3333333333, ans=0.125 2023-09-29 19:48:35,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:48:38,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:48:38,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:38,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=470913.3333333333, ans=0.2 2023-09-29 19:48:39,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 19:48:39,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:48:41,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:48:42,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:44,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 19:48:44,359 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 19:48:47,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:48:49,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=470980.0, ans=0.0 2023-09-29 19:48:51,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. 
Duration: 0.94 2023-09-29 19:48:57,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:48:57,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:59,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 19:49:01,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:49:02,598 INFO [train.py:1039] (0/4) Epoch 14, batch 1600, loss[loss=0.1715, simple_loss=0.2458, pruned_loss=0.04865, over 24334.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2646, pruned_loss=0.05934, over 4719553.82 frames. ], batch size: 61, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:49:02,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:02,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:49:02,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:49:02,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:49:04,177 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.865e+02 2.125e+02 2.416e+02 3.474e+02, threshold=4.250e+02, percent-clipped=0.0 2023-09-29 19:49:05,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 19:49:07,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 19:49:08,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 19:49:09,471 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.56 vs. limit=15.0 2023-09-29 19:49:10,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 19:49:12,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:13,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 19:49:13,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:49:13,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=471046.6666666667, ans=0.0 2023-09-29 19:49:16,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:49:24,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:49:27,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-09-29 19:49:27,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=471113.3333333333, ans=0.125 2023-09-29 19:49:30,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:49:32,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 19:49:32,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:32,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 19:49:36,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=471180.0, ans=0.2 2023-09-29 19:49:37,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 19:49:44,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:45,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 19:49:45,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:46,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:46,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 19:49:48,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 19:49:55,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 19:49:55,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=471246.6666666667, ans=0.025 2023-09-29 19:49:56,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:58,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:58,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:00,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:50:00,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=471246.6666666667, ans=0.125 2023-09-29 19:50:01,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:50:02,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=471246.6666666667, ans=0.0 2023-09-29 19:50:03,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 19:50:06,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:50:11,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:13,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:50:16,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 19:50:16,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:50:16,603 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.21 vs. limit=22.5 2023-09-29 19:50:17,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 19:50:23,423 INFO [train.py:1039] (0/4) Epoch 14, batch 1650, loss[loss=0.1646, simple_loss=0.2375, pruned_loss=0.04582, over 24266.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2655, pruned_loss=0.05988, over 4708215.54 frames. ], batch size: 56, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:50:23,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:25,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:50:26,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:50:26,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 19:50:26,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 19:50:26,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 19:50:28,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 19:50:30,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:32,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:32,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:50:32,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:50:33,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:37,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 19:50:40,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:50:40,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:50:40,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:50:40,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:50:40,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=471446.6666666667, ans=0.0 2023-09-29 19:50:42,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 19:50:42,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 19:50:47,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:50:48,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:50:50,775 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=471446.6666666667, ans=0.0 2023-09-29 19:50:56,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 19:50:57,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=471513.3333333333, ans=0.125 2023-09-29 19:50:58,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:50:59,968 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=471513.3333333333, ans=0.125 2023-09-29 19:51:01,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 19:51:03,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:08,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:51:10,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:51:10,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:11,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:51:13,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:14,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:16,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:16,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:16,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 19:51:18,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:19,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:51:24,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:25,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 19:51:25,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:27,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 19:51:28,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 19:51:28,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 19:51:28,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:28,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:51:30,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:30,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:51:30,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 19:51:32,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:34,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:51:36,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:36,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=471646.6666666667, ans=0.1 2023-09-29 19:51:41,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 19:51:44,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:44,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:51:44,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 19:51:46,330 INFO [train.py:1039] (0/4) Epoch 14, batch 1700, loss[loss=0.1856, simple_loss=0.2664, pruned_loss=0.05239, over 24658.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2648, pruned_loss=0.05933, over 4709270.52 frames. ], batch size: 65, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:51:46,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 19:51:46,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:51:46,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:49,261 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.869e+02 2.042e+02 2.278e+02 4.402e+02, threshold=4.084e+02, percent-clipped=1.0 2023-09-29 19:51:49,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:51:49,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:51:49,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 19:51:54,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:52:04,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:52:06,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:52:11,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:52:13,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 19:52:13,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:52:14,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:17,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 19:52:18,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:52:18,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:19,213 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.58 vs. limit=6.0 2023-09-29 19:52:20,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:52:21,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:52:23,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 19:52:24,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-09-29 19:52:26,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=471846.6666666667, ans=0.125 2023-09-29 19:52:28,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:29,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 19:52:31,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:52:31,560 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=471846.6666666667, ans=0.0 2023-09-29 19:52:39,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:39,442 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:52:40,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:40,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:43,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:52:43,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 19:52:43,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 19:52:47,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:47,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 19:52:49,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:52:49,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:49,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:49,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:52:51,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:51,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:52:53,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:53,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:52:54,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:57,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 19:53:00,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 19:53:01,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:03,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:04,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 19:53:07,517 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.72 vs. limit=10.0 2023-09-29 19:53:07,981 INFO [train.py:1039] (0/4) Epoch 14, batch 1750, loss[loss=0.2074, simple_loss=0.2747, pruned_loss=0.07002, over 23368.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2642, pruned_loss=0.05911, over 4719534.26 frames. ], batch size: 105, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:53:11,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:14,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:15,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:53:15,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 19:53:15,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:53:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:53:19,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:24,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 19:53:26,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:27,579 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=12.0 2023-09-29 19:53:28,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 19:53:28,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:53:28,702 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=472113.3333333333, ans=0.125 2023-09-29 19:53:31,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:53:33,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=472113.3333333333, ans=0.125 2023-09-29 19:53:34,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:53:36,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. 
Duration: 0.43 2023-09-29 19:53:38,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:53:38,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 19:53:46,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:53:50,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:53:50,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:52,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:54,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:56,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:56,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:59,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:53:59,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:54:01,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. 
Duration: 0.98 2023-09-29 19:54:05,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:54:06,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 19:54:07,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:08,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:09,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:54:14,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:54:15,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:54:15,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:18,021 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.59 vs. limit=15.0 2023-09-29 19:54:18,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:20,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=472313.3333333333, ans=0.125 2023-09-29 19:54:23,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 19:54:25,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:25,469 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=472313.3333333333, ans=0.125 2023-09-29 19:54:27,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:54:27,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 19:54:27,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:27,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=472313.3333333333, ans=0.0 2023-09-29 19:54:28,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:54:28,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:29,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:54:29,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:54:31,027 INFO [train.py:1039] (0/4) Epoch 14, batch 1800, loss[loss=0.1811, simple_loss=0.2622, pruned_loss=0.05006, over 24458.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2633, pruned_loss=0.05905, over 4712339.64 frames. 
], batch size: 66, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:54:31,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:54:34,554 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.007e+02 2.286e+02 2.732e+02 4.452e+02, threshold=4.572e+02, percent-clipped=3.0 2023-09-29 19:54:34,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:54:34,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:35,716 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.70 vs. limit=22.5 2023-09-29 19:54:36,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:54:41,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:42,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 19:54:43,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=472380.0, ans=0.125 2023-09-29 19:54:44,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:54:47,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:54:50,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:51,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:53,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:54:54,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:54,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 19:54:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:54:59,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:06,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 19:55:06,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 19:55:08,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 19:55:08,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:10,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 19:55:10,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:10,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:55:15,654 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 19:55:17,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:55:20,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:20,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=472580.0, ans=0.1 2023-09-29 19:55:21,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 19:55:21,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 19:55:21,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:55:23,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:55:24,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:55:29,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. 
Duration: 0.67 2023-09-29 19:55:29,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=472580.0, ans=0.2 2023-09-29 19:55:36,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:55:36,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 19:55:37,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:55:37,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:38,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:55:39,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 19:55:45,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:55:45,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:55:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 19:55:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:51,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:55:51,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:55:51,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:52,895 INFO [train.py:1039] (0/4) Epoch 14, batch 1850, loss[loss=0.2091, simple_loss=0.2798, pruned_loss=0.06917, over 23348.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2636, pruned_loss=0.05933, over 4706121.65 frames. ], batch size: 93, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:55:53,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:53,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:55:54,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:56,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:57,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:55:59,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:05,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:56:05,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. 
Duration: 0.89 2023-09-29 19:56:10,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 19:56:12,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 19:56:14,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=472780.0, ans=0.125 2023-09-29 19:56:17,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:18,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 19:56:18,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:56:22,394 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.94 vs. limit=12.0 2023-09-29 19:56:23,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=472780.0, ans=0.0 2023-09-29 19:56:29,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:56:30,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 19:56:34,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:56:34,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:56:34,835 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.98 vs. limit=15.0 2023-09-29 19:56:37,273 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=472846.6666666667, ans=0.1 2023-09-29 19:56:39,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 19:56:39,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:41,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:56:43,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:56:46,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:48,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:56:51,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:56:53,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:53,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-29 19:56:53,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:53,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=472913.3333333333, ans=0.125 2023-09-29 19:56:55,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:56:57,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:57:00,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 19:57:01,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:57:04,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:57:06,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:57:06,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 19:57:06,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 19:57:08,002 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 19:57:09,544 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. 
Duration: 0.64 2023-09-29 19:57:11,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:57:11,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:57:11,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:12,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:12,706 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 19:57:12,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:57:14,121 INFO [train.py:1039] (0/4) Epoch 14, batch 1900, loss[loss=0.2078, simple_loss=0.2688, pruned_loss=0.07343, over 23496.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2644, pruned_loss=0.05959, over 4715884.56 frames. ], batch size: 120, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:57:14,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:14,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:57:15,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-29 19:57:17,954 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.953e+02 2.438e+02 3.060e+02 4.986e+02, threshold=4.875e+02, percent-clipped=3.0 2023-09-29 19:57:18,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:57:18,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 19:57:19,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:19,711 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 19:57:19,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:57:21,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:27,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:29,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:57:29,999 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=473113.3333333333, ans=0.1 2023-09-29 19:57:32,174 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.29 vs. limit=15.0 2023-09-29 19:57:32,994 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-09-29 19:57:33,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 19:57:34,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:36,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:57:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 19:57:36,140 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 19:57:41,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 19:57:43,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:57:46,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 19:57:48,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 19:57:58,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 19:57:58,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=473180.0, ans=0.125 2023-09-29 19:58:01,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. 
Duration: 0.63 2023-09-29 19:58:01,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:01,462 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 19:58:01,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 19:58:01,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 19:58:03,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 19:58:03,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:08,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 19:58:11,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:58:13,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:13,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 19:58:16,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:58:19,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. 
Duration: 0.82 2023-09-29 19:58:19,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=473313.3333333333, ans=0.2 2023-09-29 19:58:20,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:27,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:58:27,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:58:27,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:58:27,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:58:29,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:58:29,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 19:58:30,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:58:33,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:33,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:58:35,811 INFO [train.py:1039] (0/4) Epoch 14, batch 1950, loss[loss=0.2056, simple_loss=0.2647, pruned_loss=0.07322, over 23768.00 frames. 
], tot_loss[loss=0.1911, simple_loss=0.2643, pruned_loss=0.05897, over 4727127.88 frames. ], batch size: 179, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:58:36,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:58:36,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:36,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:37,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:42,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:44,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:58:45,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:45,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:58:47,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 19:58:48,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:58:48,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 19:58:50,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:53,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:58:53,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:58:54,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:57,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:58:59,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:59,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:59:01,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:59:01,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:06,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:09,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:59:09,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 19:59:09,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:59:09,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 19:59:11,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:59:11,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:59:11,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:14,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:17,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:59:23,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:59:26,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:59:26,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:59:27,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 19:59:27,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 19:59:32,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:59:33,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:59:33,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:59:40,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=473646.6666666667, ans=0.1 2023-09-29 19:59:43,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:45,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:47,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:49,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:52,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:59:54,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:54,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. 
Duration: 0.86 2023-09-29 19:59:54,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:59:55,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:55,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 19:59:55,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=473713.3333333333, ans=0.125 2023-09-29 19:59:57,171 INFO [train.py:1039] (0/4) Epoch 14, batch 2000, loss[loss=0.1756, simple_loss=0.2523, pruned_loss=0.04944, over 21144.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2652, pruned_loss=0.05985, over 4721846.87 frames. ], batch size: 46, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 19:59:58,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:00,435 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.917e+02 2.206e+02 2.573e+02 3.762e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 20:00:00,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:00:02,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:00:02,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:03,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 20:00:06,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:08,765 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.31 vs. limit=15.0 2023-09-29 20:00:11,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 20:00:11,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:00:16,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:00:17,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 20:00:19,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:00:19,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:21,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:00:24,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 20:00:26,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:00:26,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:28,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 20:00:29,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:00:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 20:00:31,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:34,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:00:34,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:00:34,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:35,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:37,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:38,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 20:00:40,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. 
Duration: 0.97 2023-09-29 20:00:40,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:40,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:00:44,180 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=473913.3333333333, ans=0.1 2023-09-29 20:00:45,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:46,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:00:46,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:47,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:49,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:51,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:51,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:51,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:00:53,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:56,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:57,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 20:01:04,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:01:05,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:01:10,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=473980.0, ans=0.1 2023-09-29 20:01:13,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:15,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:15,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 20:01:15,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:01:18,022 INFO [train.py:1039] (0/4) Epoch 14, batch 2050, loss[loss=0.2065, simple_loss=0.2495, pruned_loss=0.08172, over 19391.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2648, pruned_loss=0.06015, over 4709297.04 frames. ], batch size: 388, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:01:18,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:19,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:25,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:25,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:30,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:01:33,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:01:35,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:35,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:01:36,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. 
Duration: 0.97 2023-09-29 20:01:36,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:01:37,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:01:38,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:01:48,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:48,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:49,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 20:01:53,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:53,797 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=474180.0, ans=0.0 2023-09-29 20:01:54,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 20:01:55,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:58,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:01,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:02:03,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:02:03,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:05,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:02:06,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:02:06,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:02:08,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:10,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:02:13,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:02:13,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:02:17,654 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.67 vs. limit=22.5 2023-09-29 20:02:18,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 20:02:22,390 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.75 vs. limit=6.0 2023-09-29 20:02:22,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:02:22,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 20:02:29,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:29,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=474313.3333333333, ans=0.125 2023-09-29 20:02:30,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:02:32,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:02:33,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 20:02:38,872 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 20:02:38,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:38,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:40,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-29 20:02:40,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:41,986 INFO [train.py:1039] (0/4) Epoch 14, batch 2100, loss[loss=0.188, simple_loss=0.2741, pruned_loss=0.05094, over 24424.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2641, pruned_loss=0.05984, over 4710636.15 frames. ], batch size: 69, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:02:42,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 20:02:42,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 20:02:43,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:45,132 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.952e+02 2.197e+02 2.435e+02 3.188e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 20:02:46,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:02:48,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:02:50,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:51,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:02:51,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. 
Duration: 0.85 2023-09-29 20:02:52,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:02:54,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 20:02:54,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 20:02:56,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:02:56,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:02:56,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 20:02:58,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:03:05,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 20:03:05,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:03:08,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:03:10,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:03:14,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 20:03:15,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 20:03:15,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:15,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 20:03:18,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 20:03:18,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:18,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 20:03:18,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 20:03:20,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 20:03:21,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:03:23,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:03:25,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=474513.3333333333, ans=0.04949747468305833 2023-09-29 20:03:26,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 20:03:26,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=474513.3333333333, ans=0.95 2023-09-29 20:03:27,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:29,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:31,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:31,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 20:03:31,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:31,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:32,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:32,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 20:03:33,609 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.37 vs. limit=15.0 2023-09-29 20:03:36,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. 
Duration: 0.53 2023-09-29 20:03:36,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 20:03:41,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:03:44,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:03:44,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 20:03:47,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=474646.6666666667, ans=0.0 2023-09-29 20:03:48,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=474646.6666666667, ans=0.0 2023-09-29 20:03:49,366 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.48 vs. limit=22.5 2023-09-29 20:03:50,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:53,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:03:53,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:03:53,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:03:53,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. 
Duration: 0.57775 2023-09-29 20:03:53,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:03:57,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:57,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:03:57,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:03:57,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:59,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 20:04:01,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 20:04:01,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:02,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:02,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:04:04,275 INFO [train.py:1039] (0/4) Epoch 14, batch 2150, loss[loss=0.1959, simple_loss=0.2803, pruned_loss=0.05577, over 24579.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2629, pruned_loss=0.05935, over 4716237.39 frames. 
], batch size: 71, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:04:04,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:04:04,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:04:13,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 20:04:14,145 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.16 vs. limit=15.0 2023-09-29 20:04:15,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:15,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:17,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:04:18,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:18,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:04:22,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:23,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 20:04:23,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:04:26,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:28,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 20:04:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:34,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:04:37,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:37,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:04:38,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:38,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:04:38,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:04:40,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 20:04:42,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:04:42,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:42,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:44,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:04:46,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:04:48,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:50,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:04:51,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:51,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 20:04:51,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. 
Duration: 0.87775 2023-09-29 20:04:51,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=474846.6666666667, ans=0.125 2023-09-29 20:04:53,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:53,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:55,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:56,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:04:58,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:59,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:59,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 20:05:02,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 20:05:02,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:05:02,892 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 20:05:04,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:05:04,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:05:05,174 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-09-29 20:05:05,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 20:05:05,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:05:05,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 20:05:05,962 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 20:05:05,962 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 20:05:06,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 20:05:07,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:09,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:05:09,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:05:10,072 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.42 vs. 
limit=22.5 2023-09-29 20:05:10,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:12,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:05:13,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:15,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:24,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:05:25,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 20:05:27,332 INFO [train.py:1039] (0/4) Epoch 14, batch 2200, loss[loss=0.1738, simple_loss=0.2625, pruned_loss=0.04258, over 24297.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2628, pruned_loss=0.05916, over 4721244.82 frames. ], batch size: 74, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:05:29,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:05:30,433 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.895e+02 2.112e+02 2.594e+02 4.631e+02, threshold=4.225e+02, percent-clipped=1.0 2023-09-29 20:05:30,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=475046.6666666667, ans=0.125 2023-09-29 20:05:35,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:05:35,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:05:37,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:05:37,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:05:40,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:42,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:05:42,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 20:05:46,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 20:05:48,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:05:53,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 20:05:55,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=475113.3333333333, ans=0.125 2023-09-29 20:05:57,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:58,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 20:06:00,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:06:03,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:06:03,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 20:06:05,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:06:07,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:07,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:06:09,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=475180.0, ans=0.0 2023-09-29 20:06:12,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:06:13,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:15,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:06:16,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:18,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-09-29 20:06:19,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:23,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 20:06:26,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:26,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:06:26,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:29,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:29,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:29,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:29,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:30,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:06:32,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:06:32,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-29 20:06:35,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 20:06:35,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:06:38,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:06:40,827 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 20:06:41,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:06:42,474 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 20:06:43,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:06:44,068 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 20:06:45,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:47,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:06:49,983 INFO [train.py:1039] (0/4) Epoch 14, batch 2250, loss[loss=0.1732, simple_loss=0.2508, pruned_loss=0.04777, over 24482.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.263, pruned_loss=0.05912, over 4716105.76 frames. ], batch size: 63, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:06:50,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:06:51,636 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 20:06:53,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:06:56,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:06:58,256 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=475380.0, ans=0.1 2023-09-29 20:07:04,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:07:05,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:07:07,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=475446.6666666667, ans=0.125 2023-09-29 20:07:09,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:09,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:10,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:12,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 20:07:12,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 20:07:12,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:07:15,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 20:07:17,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:07:17,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:19,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:23,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:24,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:07:24,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:07:27,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 20:07:28,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:28,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=475513.3333333333, ans=0.125 2023-09-29 20:07:30,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:07:35,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:35,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=475513.3333333333, ans=0.1 2023-09-29 20:07:37,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:07:37,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:40,932 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=475580.0, ans=0.1 2023-09-29 20:07:42,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:43,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:07:48,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:07:49,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:07:53,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 20:07:55,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:07:55,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:08:02,049 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.26 vs. limit=15.0 2023-09-29 20:08:02,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:08:04,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:08:04,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 20:08:04,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:05,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:08:08,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 20:08:12,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:08:12,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:13,506 INFO [train.py:1039] (0/4) Epoch 14, batch 2300, loss[loss=0.1963, simple_loss=0.2743, pruned_loss=0.05916, over 23669.00 frames. 
], tot_loss[loss=0.1906, simple_loss=0.2637, pruned_loss=0.05873, over 4715083.06 frames. ], batch size: 85, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:08:16,571 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.985e+02 2.281e+02 2.632e+02 4.053e+02, threshold=4.563e+02, percent-clipped=0.0 2023-09-29 20:08:17,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=475713.3333333333, ans=0.125 2023-09-29 20:08:18,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:19,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:08:21,402 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 20:08:23,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:30,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:08:30,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:08:30,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:08:30,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:30,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. 
Duration: 0.9 2023-09-29 20:08:33,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:08:34,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:08:36,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:08:41,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:08:45,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:08:49,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:08:52,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:08:54,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:57,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:08:58,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:59,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=475846.6666666667, ans=0.2 2023-09-29 20:09:04,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 20:09:04,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=475913.3333333333, ans=0.125 2023-09-29 20:09:05,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:09:05,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:09:05,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 20:09:05,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=475913.3333333333, ans=0.0 2023-09-29 20:09:10,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:09:10,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:10,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:10,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:09:11,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:11,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:09:11,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 20:09:12,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 20:09:13,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:09:13,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:13,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 20:09:21,713 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.24 vs. limit=6.0 2023-09-29 20:09:22,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:09:27,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:09:31,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:31,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:09:31,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:09:33,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 20:09:33,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:09:34,818 INFO [train.py:1039] (0/4) Epoch 14, batch 2350, loss[loss=0.2094, simple_loss=0.277, pruned_loss=0.07089, over 23442.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2645, pruned_loss=0.05921, over 4721405.18 frames. ], batch size: 120, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:09:34,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:09:36,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 20:09:40,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:09:40,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 20:09:46,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 20:09:46,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=476046.6666666667, ans=0.0 2023-09-29 20:09:49,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:09:50,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=476113.3333333333, ans=0.04949747468305833 2023-09-29 20:09:54,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=476113.3333333333, ans=0.2 2023-09-29 20:09:55,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:09:55,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:55,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 20:10:00,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:10:07,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 20:10:08,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=476180.0, ans=0.04949747468305833 2023-09-29 20:10:09,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:10:10,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 20:10:10,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:10:15,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:10:17,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 20:10:17,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:10:19,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:10:20,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:20,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:10:25,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:10:26,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 20:10:28,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:10:30,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:10:30,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:10:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 20:10:34,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:10:35,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 20:10:37,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:10:42,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 20:10:43,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 20:10:45,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:45,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:10:45,382 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 20:10:46,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 20:10:48,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 20:10:52,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:10:55,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:10:56,572 INFO [train.py:1039] (0/4) Epoch 14, batch 2400, loss[loss=0.2126, simple_loss=0.268, pruned_loss=0.07855, over 23784.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2644, pruned_loss=0.05958, over 4719588.29 frames. ], batch size: 212, lr: 7.41e-03, grad_scale: 32.0 2023-09-29 20:10:59,957 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.908e+02 2.123e+02 2.498e+02 3.353e+02, threshold=4.247e+02, percent-clipped=0.0 2023-09-29 20:11:00,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:11:01,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:11:01,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 20:11:03,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 20:11:10,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=476380.0, ans=0.0 2023-09-29 20:11:11,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:11:11,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:11:13,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. 
Duration: 0.83 2023-09-29 20:11:14,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:11:16,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:16,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 20:11:22,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:24,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 20:11:24,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=476446.6666666667, ans=0.0 2023-09-29 20:11:29,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:11:34,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=476513.3333333333, ans=0.125 2023-09-29 20:11:35,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 20:11:39,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:11:40,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:45,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:11:47,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 20:11:48,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:11:53,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:11:53,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=476580.0, ans=0.1 2023-09-29 20:11:56,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:12:01,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:01,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:12:01,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:12:01,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:12:03,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:03,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:03,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 20:12:08,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:08,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:12:08,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 20:12:10,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 20:12:12,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:12:12,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:13,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 20:12:15,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 20:12:15,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 20:12:15,436 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 20:12:16,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 20:12:18,292 INFO [train.py:1039] (0/4) Epoch 14, batch 2450, loss[loss=0.1621, simple_loss=0.2368, pruned_loss=0.04372, over 24327.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.263, pruned_loss=0.05865, over 4724503.63 frames. 
], batch size: 56, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:12:18,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:12:19,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:19,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:21,461 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 20:12:23,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:23,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:12:24,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:12:24,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:28,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:28,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:29,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 20:12:36,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:12:36,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:37,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:12:39,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:12:39,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:12:39,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 20:12:46,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:47,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:12:47,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:48,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=476780.0, ans=0.1 2023-09-29 20:12:52,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:12:52,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:52,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 20:12:53,700 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.09 vs. limit=15.0 2023-09-29 20:12:54,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:54,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 20:12:55,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:13:03,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:05,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:13:05,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:05,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:13:07,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:07,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:13:08,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 20:13:14,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 20:13:14,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:13:17,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:13:17,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:22,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:13:22,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 20:13:23,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:13:25,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:13:25,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 20:13:25,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:13:26,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:13:27,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=476980.0, ans=0.0 2023-09-29 20:13:31,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 20:13:31,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=476980.0, ans=0.0 2023-09-29 20:13:32,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:32,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:13:37,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 20:13:38,877 INFO [train.py:1039] (0/4) Epoch 14, batch 2500, loss[loss=0.207, simple_loss=0.2649, pruned_loss=0.07462, over 22828.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2611, pruned_loss=0.05798, over 4717700.63 frames. ], batch size: 322, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:13:39,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:13:44,482 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.907e+02 2.163e+02 2.456e+02 3.959e+02, threshold=4.326e+02, percent-clipped=0.0 2023-09-29 20:13:45,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=477046.6666666667, ans=0.125 2023-09-29 20:13:48,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:13:58,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:13:58,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:14:00,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:14:00,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 20:14:01,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=477113.3333333333, ans=0.0 2023-09-29 20:14:04,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:14:05,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:06,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:14:06,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:14:08,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 20:14:09,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:10,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:10,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 20:14:10,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:14:12,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 20:14:12,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:16,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:14:18,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:18,968 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.10 vs. limit=22.5 2023-09-29 20:14:23,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:14:25,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 20:14:27,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:14:30,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:33,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:36,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:39,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 20:14:43,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:14:47,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 20:14:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:47,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:14:47,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=477313.3333333333, ans=0.125 2023-09-29 20:14:48,016 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.96 vs. limit=6.0 2023-09-29 20:14:50,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:14:50,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:14:52,850 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 20:14:52,851 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 20:14:52,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 20:14:54,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:14:58,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 20:14:58,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 20:14:59,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:59,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 20:15:02,399 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.64 vs. limit=15.0 2023-09-29 20:15:03,181 INFO [train.py:1039] (0/4) Epoch 14, batch 2550, loss[loss=0.2076, simple_loss=0.2762, pruned_loss=0.06956, over 23715.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2626, pruned_loss=0.05881, over 4716204.00 frames. ], batch size: 232, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:15:04,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 20:15:07,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:09,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:15:09,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:15:12,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 20:15:12,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 20:15:12,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:15:15,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 20:15:17,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:15:19,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:22,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:15:22,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 20:15:23,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:25,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:26,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:29,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:15:29,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. 
Duration: 0.94 2023-09-29 20:15:30,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:15:30,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:30,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 20:15:45,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:15:48,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:15:48,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:48,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:50,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:15:56,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:58,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:58,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:16:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 20:16:00,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:16:01,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:16:02,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:04,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:08,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:16:09,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 20:16:09,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:16:10,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:11,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:16:12,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:16:13,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:20,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:16:22,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:24,921 INFO [train.py:1039] (0/4) Epoch 14, batch 2600, loss[loss=0.2077, simple_loss=0.2699, pruned_loss=0.07278, over 23777.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.264, pruned_loss=0.05931, over 4706531.32 frames. ], batch size: 212, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:16:25,133 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 20:16:28,246 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 20:16:28,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:16:28,346 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 20:16:29,682 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.879e+02 2.138e+02 2.500e+02 3.129e+02, threshold=4.275e+02, percent-clipped=0.0 2023-09-29 20:16:29,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 20:16:29,864 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 20:16:32,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:32,205 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 20:16:35,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. 
Duration: 0.97 2023-09-29 20:16:37,663 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 20:16:40,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:16:44,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 20:16:45,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 20:16:47,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:16:47,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 20:16:50,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 20:16:50,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 20:16:51,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=477780.0, ans=0.07 2023-09-29 20:16:54,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=477780.0, ans=0.0 2023-09-29 20:16:57,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:16:57,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:16:57,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:16:57,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 20:16:58,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:17:03,647 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 20:17:11,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:11,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:13,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 20:17:14,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:14,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:17:14,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 20:17:17,602 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.95 vs. limit=22.5 2023-09-29 20:17:19,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 20:17:19,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:17:21,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:24,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 20:17:25,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:26,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:17:29,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=477913.3333333333, ans=0.1 2023-09-29 20:17:31,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:32,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:17:32,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 20:17:32,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=477980.0, ans=0.125 2023-09-29 20:17:33,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:36,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 20:17:36,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:17:41,475 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=477980.0, ans=0.125 2023-09-29 20:17:41,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=477980.0, ans=0.125 2023-09-29 20:17:44,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 20:17:46,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:47,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=478046.6666666667, ans=0.0 2023-09-29 20:17:49,000 INFO [train.py:1039] (0/4) Epoch 14, batch 2650, loss[loss=0.1757, simple_loss=0.2442, pruned_loss=0.05358, over 24359.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2642, pruned_loss=0.05931, over 4710243.79 frames. ], batch size: 56, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:17:50,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:17:54,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 20:17:54,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=478046.6666666667, ans=0.125 2023-09-29 20:17:55,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:17:57,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:17:58,747 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 20:17:58,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:01,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:03,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:18:04,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:18:07,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:18:09,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 20:18:09,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:18:10,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:18:12,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 20:18:12,746 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-09-29 20:18:12,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=478113.3333333333, ans=0.125 2023-09-29 20:18:17,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:18,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 20:18:18,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:20,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 20:18:24,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:24,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:18:24,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:24,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:31,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 20:18:31,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 20:18:34,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 20:18:36,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=478246.6666666667, ans=0.125 2023-09-29 20:18:39,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 20:18:39,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:39,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=478246.6666666667, ans=0.125 2023-09-29 20:18:40,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:40,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:18:40,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:42,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:43,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:45,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=478246.6666666667, ans=0.125 2023-09-29 20:18:46,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:18:47,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:47,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:18:48,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:18:50,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:50,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:18:50,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:53,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:53,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:18:58,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:00,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:19:00,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:02,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-09-29 20:19:04,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=478313.3333333333, ans=0.07 2023-09-29 20:19:05,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:07,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:07,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:07,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=478313.3333333333, ans=0.125 2023-09-29 20:19:08,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:10,264 INFO [train.py:1039] (0/4) Epoch 14, batch 2700, loss[loss=0.1953, simple_loss=0.274, pruned_loss=0.0583, over 24023.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2654, pruned_loss=0.05963, over 4718236.93 frames. ], batch size: 80, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:19:10,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:19:10,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:13,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:13,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. 
Duration: 0.84 2023-09-29 20:19:14,658 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.969e+02 2.229e+02 2.662e+02 4.082e+02, threshold=4.458e+02, percent-clipped=0.0 2023-09-29 20:19:15,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=478380.0, ans=0.125 2023-09-29 20:19:16,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:19:18,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:19:19,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:19:21,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:21,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:22,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:19:22,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:23,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:19:23,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:19:23,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. 
Duration: 0.63 2023-09-29 20:19:25,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:19:25,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:19:27,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:19:28,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:32,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:19:32,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 20:19:33,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:19:35,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=478446.6666666667, ans=0.125 2023-09-29 20:19:38,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:19:38,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:19:44,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:19:44,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 20:19:44,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:19:44,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:19:47,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:19:49,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:49,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:19:49,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:19:56,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:56,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:20:03,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=478580.0, ans=0.125 2023-09-29 20:20:06,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:20:08,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:12,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 20:20:12,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:15,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:16,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:17,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:20:17,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=478646.6666666667, ans=0.1 2023-09-29 20:20:18,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:20,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:20,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:20:25,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:25,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:20:26,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 20:20:26,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:29,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:20:30,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 20:20:31,470 INFO [train.py:1039] (0/4) Epoch 14, batch 2750, loss[loss=0.1948, simple_loss=0.2721, pruned_loss=0.05878, over 24069.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2659, pruned_loss=0.05966, over 4710925.55 frames. ], batch size: 80, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:20:31,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 20:20:31,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:35,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:35,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:37,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:39,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 20:20:39,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:44,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:20:44,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=478713.3333333333, ans=0.1 2023-09-29 20:20:46,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:20:46,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:20:46,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:46,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 20:20:46,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:20:46,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:47,040 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.39 vs. limit=15.0 2023-09-29 20:20:51,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=478780.0, ans=0.1 2023-09-29 20:20:52,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. 
Duration: 0.46 2023-09-29 20:20:54,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:55,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:55,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:55,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:20:55,916 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=478780.0, ans=0.07 2023-09-29 20:20:57,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:57,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:20:57,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:57,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=478780.0, ans=0.0 2023-09-29 20:20:58,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:01,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=478780.0, ans=0.1 2023-09-29 20:21:03,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 20:21:03,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:21:05,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:21:06,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:08,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:21:17,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:18,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:21:18,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:22,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:22,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:21:24,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 20:21:24,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=478913.3333333333, ans=0.0 2023-09-29 20:21:31,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:21:31,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:21:31,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 20:21:36,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:37,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 20:21:43,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:21:45,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:21:46,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 20:21:47,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:21:49,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:21:49,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. 
Duration: 0.77 2023-09-29 20:21:49,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:21:53,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:21:54,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:21:54,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:21:56,022 INFO [train.py:1039] (0/4) Epoch 14, batch 2800, loss[loss=0.1957, simple_loss=0.2636, pruned_loss=0.06392, over 23650.00 frames. ], tot_loss[loss=0.191, simple_loss=0.264, pruned_loss=0.05895, over 4704755.85 frames. ], batch size: 149, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:21:56,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 20:21:56,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:21:57,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:59,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:00,649 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 20:22:00,650 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. 
Duration: 0.759 2023-09-29 20:22:03,503 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.993e+02 2.230e+02 2.590e+02 3.913e+02, threshold=4.460e+02, percent-clipped=0.0 2023-09-29 20:22:03,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=479046.6666666667, ans=0.1 2023-09-29 20:22:05,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:06,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:22:06,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:22:10,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:22:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 20:22:15,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:22:16,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 20:22:16,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:18,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 20:22:18,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:24,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:24,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=479113.3333333333, ans=0.125 2023-09-29 20:22:25,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:25,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:22:25,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:22:34,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:22:35,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:38,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:38,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:22:40,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:22:45,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:22:45,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 20:22:45,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=479246.6666666667, ans=0.1 2023-09-29 20:22:46,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:47,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=479246.6666666667, ans=0.02 2023-09-29 20:22:48,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:48,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:22:51,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:53,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:57,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:23:00,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:23:00,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:23:00,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:23:01,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:23:02,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:23:02,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:23:03,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 20:23:04,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:04,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:23:05,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:07,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 20:23:07,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:08,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:23:08,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 20:23:10,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 20:23:16,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:23:16,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:23:16,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:23:18,209 INFO [train.py:1039] (0/4) Epoch 14, batch 2850, loss[loss=0.1986, simple_loss=0.2649, pruned_loss=0.06613, over 23395.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2634, pruned_loss=0.05869, over 4707691.79 frames. ], batch size: 285, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:23:19,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:23,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:23:24,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:23:24,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:23:28,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:28,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:23:30,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:23:30,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=479380.0, ans=0.0 2023-09-29 20:23:32,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 20:23:33,144 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.44 vs. limit=22.5 2023-09-29 20:23:36,069 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.61 vs. limit=10.0 2023-09-29 20:23:38,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 20:23:38,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:40,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 20:23:40,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:44,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 20:23:46,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 20:23:47,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:23:49,775 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=479513.3333333333, ans=0.125 2023-09-29 20:23:50,368 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-09-29 20:23:58,529 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:23:59,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:01,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:01,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:24:02,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:24:02,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:24:02,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:24:05,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:24:05,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 20:24:08,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 20:24:10,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:10,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:10,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:13,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:14,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:15,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:17,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:18,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:24:20,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:21,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:24,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:24:27,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 20:24:29,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=479646.6666666667, ans=0.07 2023-09-29 20:24:31,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 20:24:31,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 20:24:33,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:24:34,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:34,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 20:24:34,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:24:36,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:36,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:36,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:24:36,373 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 20:24:36,432 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. 
Duration: 0.896 2023-09-29 20:24:36,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:24:38,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:41,446 INFO [train.py:1039] (0/4) Epoch 14, batch 2900, loss[loss=0.1684, simple_loss=0.2522, pruned_loss=0.04233, over 24458.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2625, pruned_loss=0.05828, over 4706245.90 frames. ], batch size: 66, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:24:45,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:24:45,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:46,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:46,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 20:24:50,315 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.184e+02 2.656e+02 3.783e+02, threshold=4.367e+02, percent-clipped=0.0 2023-09-29 20:24:52,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:52,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 20:24:53,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. 
Duration: 0.98 2023-09-29 20:24:54,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:24:54,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:24:58,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:59,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:25:02,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:25:02,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:25:04,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=479780.0, ans=0.2 2023-09-29 20:25:06,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:25:06,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 20:25:07,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:25:09,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:13,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. 
Duration: 0.99 2023-09-29 20:25:14,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 20:25:17,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:25:17,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 20:25:17,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:25:18,076 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=479846.6666666667, ans=0.2 2023-09-29 20:25:21,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:25:21,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:25:22,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:25:23,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=479846.6666666667, ans=0.0 2023-09-29 20:25:24,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:27,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:25:30,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:25:32,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 20:25:32,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 20:25:32,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:25:37,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:25:39,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 20:25:42,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:25:47,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:50,008 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.25 vs. limit=15.0 2023-09-29 20:25:50,797 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-72000.pt 2023-09-29 20:25:59,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:25:59,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:26:01,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-09-29 20:26:05,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:05,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 20:26:06,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:06,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:26:07,891 INFO [train.py:1039] (0/4) Epoch 14, batch 2950, loss[loss=0.2005, simple_loss=0.2673, pruned_loss=0.0668, over 23621.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2631, pruned_loss=0.05831, over 4711992.79 frames. ], batch size: 256, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:26:08,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=480046.6666666667, ans=0.125 2023-09-29 20:26:11,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:12,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 20:26:14,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:14,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:14,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:26:16,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:26:18,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 20:26:18,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 20:26:21,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:26:21,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:29,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:31,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:26:33,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:26:34,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:36,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=480113.3333333333, ans=0.125 2023-09-29 20:26:38,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:26:38,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 20:26:39,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:26:42,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 20:26:47,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 20:26:48,902 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 20:26:50,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:26:52,808 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 20:26:54,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 20:26:54,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:54,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:54,619 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. 
Duration: 0.896 2023-09-29 20:26:54,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:26:57,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 20:26:58,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=480246.6666666667, ans=0.2 2023-09-29 20:27:01,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:27:01,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:27:04,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:06,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:27:06,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:06,334 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 20:27:08,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:08,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 20:27:13,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:27:13,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:14,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=480313.3333333333, ans=0.125 2023-09-29 20:27:15,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 20:27:15,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:27:15,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 20:27:18,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:21,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:27:21,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:27:23,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:23,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:27:24,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:27:24,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:27:24,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:27:24,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:27:25,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=480313.3333333333, ans=0.0 2023-09-29 20:27:26,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:28,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:27:28,996 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.26 vs. limit=12.0 2023-09-29 20:27:29,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:29,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 20:27:31,374 INFO [train.py:1039] (0/4) Epoch 14, batch 3000, loss[loss=0.2379, simple_loss=0.2877, pruned_loss=0.09408, over 18880.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2636, pruned_loss=0.05914, over 4709328.70 frames. 
], batch size: 388, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:27:31,376 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 20:27:42,351 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.9230, 3.7644, 2.8715, 3.0340, 3.1844, 3.5517, 2.5391, 3.6395], device='cuda:0') 2023-09-29 20:27:46,276 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.6956, 4.3682, 4.1664, 3.9726], device='cuda:0') 2023-09-29 20:27:46,675 INFO [train.py:1071] (0/4) Epoch 14, validation: loss=0.2839, simple_loss=0.2749, pruned_loss=0.1465, over 1125622.00 frames. 2023-09-29 20:27:46,675 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-29 20:27:46,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:47,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=480380.0, ans=0.2 2023-09-29 20:27:48,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:27:49,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:27:51,633 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 20:27:53,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. 
Duration: 0.98 2023-09-29 20:27:54,617 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.901e+02 2.128e+02 2.266e+02 3.715e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 20:27:54,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:55,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=480380.0, ans=0.125 2023-09-29 20:27:56,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:27:56,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 20:27:56,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:28:04,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:28:11,706 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-09-29 20:28:17,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:28:22,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 20:28:23,690 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.72 vs. 
limit=15.0 2023-09-29 20:28:24,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:28:26,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:28:26,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:28:26,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:28:29,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:29,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 20:28:29,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=480513.3333333333, ans=0.0 2023-09-29 20:28:32,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 20:28:33,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:28:35,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:28:37,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:28:38,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 20:28:38,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:38,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:28:41,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:28:42,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:42,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:28:44,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:47,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 20:28:49,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:28:50,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:28:50,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:28:53,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:53,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 20:28:55,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:28:56,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 20:28:56,578 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=14.36 vs. limit=15.0 2023-09-29 20:28:57,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:28:57,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 20:28:57,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:28:58,394 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.52 vs. limit=15.0 2023-09-29 20:28:59,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 20:29:02,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:03,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:29:03,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 20:29:05,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. 
Duration: 0.91 2023-09-29 20:29:05,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:29:05,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:29:06,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:29:08,214 INFO [train.py:1039] (0/4) Epoch 14, batch 3050, loss[loss=0.174, simple_loss=0.2585, pruned_loss=0.04478, over 24565.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2644, pruned_loss=0.05929, over 4712820.67 frames. ], batch size: 71, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:29:08,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:29:08,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:08,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:29:13,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 20:29:15,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:18,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:18,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 20:29:20,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=480713.3333333333, ans=0.0 2023-09-29 20:29:23,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:27,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 20:29:30,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 20:29:33,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 20:29:33,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:36,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:29:37,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:37,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:39,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:44,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:29:44,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 20:29:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:46,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:46,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:48,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:49,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:52,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:52,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 20:29:54,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:54,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:29:55,651 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.16 vs. limit=15.0 2023-09-29 20:29:57,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 20:29:57,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:29:59,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:29:59,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:07,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:30:07,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:09,784 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-09-29 20:30:13,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:13,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:30:13,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:30:16,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:18,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 20:30:18,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:30:20,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 20:30:22,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:22,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:22,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 20:30:25,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:30,490 INFO [train.py:1039] (0/4) Epoch 14, batch 3100, loss[loss=0.2124, simple_loss=0.2691, pruned_loss=0.07789, over 23802.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2642, pruned_loss=0.05935, over 4711341.45 frames. ], batch size: 212, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:30:32,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:34,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:30:34,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=481046.6666666667, ans=0.125 2023-09-29 20:30:35,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 20:30:38,647 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.952e+02 2.226e+02 2.517e+02 3.865e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 20:30:38,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 20:30:41,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 20:30:43,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 20:30:43,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:30:44,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=481046.6666666667, ans=0.125 2023-09-29 20:30:45,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:30:45,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:50,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:30:54,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:30:56,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=481113.3333333333, ans=0.125 2023-09-29 20:30:59,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 20:30:59,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=481113.3333333333, ans=0.0 2023-09-29 20:31:04,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=481180.0, ans=0.025 2023-09-29 20:31:06,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:31:07,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:07,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:07,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:08,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:31:09,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=481180.0, ans=0.0 2023-09-29 20:31:10,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:31:10,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. 
Duration: 0.62 2023-09-29 20:31:10,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:31:11,514 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.27 vs. limit=6.0 2023-09-29 20:31:12,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:13,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 20:31:15,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:31:18,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:31:20,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 20:31:20,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 20:31:21,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:23,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:25,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:25,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:31:26,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:31:26,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:31:26,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:31:28,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:31:30,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:31:30,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:30,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 20:31:35,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:37,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 20:31:39,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:31:39,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 20:31:41,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:31:41,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=481313.3333333333, ans=0.125 2023-09-29 20:31:42,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:42,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 20:31:53,120 INFO [train.py:1039] (0/4) Epoch 14, batch 3150, loss[loss=0.191, simple_loss=0.2327, pruned_loss=0.0747, over 19658.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2634, pruned_loss=0.05892, over 4721509.08 frames. ], batch size: 388, lr: 7.37e-03, grad_scale: 8.0 2023-09-29 20:31:53,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 20:31:54,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=481380.0, ans=0.0 2023-09-29 20:31:54,555 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.02 vs. limit=10.0 2023-09-29 20:31:56,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:31:57,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:58,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:58,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 20:32:00,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 20:32:02,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:02,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:32:03,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 20:32:06,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:09,088 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 20:32:09,684 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=15.0 2023-09-29 20:32:12,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 20:32:12,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:32:14,132 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 20:32:14,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:32:15,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. 
Duration: 0.95 2023-09-29 20:32:17,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 20:32:17,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 20:32:17,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:17,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:18,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:20,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 20:32:21,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:21,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:23,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:32:25,327 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=481513.3333333333, ans=0.0 2023-09-29 20:32:25,665 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.25 vs. 
limit=10.0 2023-09-29 20:32:27,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=481513.3333333333, ans=0.125 2023-09-29 20:32:28,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 20:32:28,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:32:31,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:32:31,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:33,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 20:32:37,554 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:32:38,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 20:32:38,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:32:39,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:32:39,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:32:40,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:32:40,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:32:40,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:32:40,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:32:43,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 20:32:43,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:32:43,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:32:47,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:47,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 20:32:47,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:49,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 20:32:49,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:32:50,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 20:32:51,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 20:32:51,693 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.07 vs. limit=22.5 2023-09-29 20:32:53,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:32:53,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:55,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 20:32:56,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:32:56,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:33:01,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:33:01,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:02,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:33:09,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 20:33:09,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:11,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 20:33:16,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:33:16,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:33:18,108 INFO [train.py:1039] (0/4) Epoch 14, batch 3200, loss[loss=0.1839, simple_loss=0.2256, pruned_loss=0.07107, over 19169.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2618, pruned_loss=0.05813, over 4710795.82 frames. ], batch size: 389, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:33:20,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:20,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:33:20,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 20:33:24,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:33:26,519 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.865e+02 2.106e+02 2.358e+02 3.213e+02, threshold=4.213e+02, percent-clipped=0.0 2023-09-29 20:33:28,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 20:33:31,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:41,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:33:52,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 20:33:53,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:33:55,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=481846.6666666667, ans=0.0 2023-09-29 20:33:56,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 20:33:56,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:33:58,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=481846.6666666667, ans=0.125 2023-09-29 20:34:01,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:34:01,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:34:03,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:34:07,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-09-29 20:34:09,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:34:10,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 20:34:12,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 20:34:15,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:34:21,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:21,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:34:23,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:23,501 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 20:34:23,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:34:30,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:34:31,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 20:34:33,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-09-29 20:34:34,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 20:34:36,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 20:34:37,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:34:41,008 INFO [train.py:1039] (0/4) Epoch 14, batch 3250, loss[loss=0.1938, simple_loss=0.2617, pruned_loss=0.06301, over 23715.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.262, pruned_loss=0.05816, over 4717901.35 frames. ], batch size: 232, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:34:41,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:34:41,131 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 20:34:41,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:34:41,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:34:42,725 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 20:34:46,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:34:48,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:34:58,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:34:58,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 20:35:00,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:02,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:02,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:02,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:03,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:35:04,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=482113.3333333333, ans=0.125 2023-09-29 20:35:05,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:35:06,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 20:35:06,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:08,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:09,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:11,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:14,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:14,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:16,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:16,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:17,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:21,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 20:35:23,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:35:23,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:35:25,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:27,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:35:33,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:35:35,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=482246.6666666667, ans=0.2 2023-09-29 20:35:41,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:35:42,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 20:35:42,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:35:42,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:35:42,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:44,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. 
Duration: 0.9 2023-09-29 20:35:44,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 20:35:45,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:47,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:47,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:48,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:35:48,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:49,100 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=482313.3333333333, ans=0.2 2023-09-29 20:35:50,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=482313.3333333333, ans=0.2 2023-09-29 20:35:53,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:53,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:56,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 20:35:56,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 20:35:58,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=482313.3333333333, ans=0.2 2023-09-29 20:36:00,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:36:00,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 20:36:02,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=482380.0, ans=0.1 2023-09-29 20:36:03,604 INFO [train.py:1039] (0/4) Epoch 14, batch 3300, loss[loss=0.1782, simple_loss=0.2549, pruned_loss=0.05072, over 24613.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2637, pruned_loss=0.0588, over 4726103.74 frames. ], batch size: 60, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:36:03,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:36:03,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 20:36:05,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 20:36:07,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 20:36:07,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:36:11,800 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.874e+02 2.156e+02 2.504e+02 3.971e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 20:36:12,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:36:13,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:36:13,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:16,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:36:16,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:36:18,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:21,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:36:21,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=482446.6666666667, ans=0.025 2023-09-29 20:36:25,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 20:36:26,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:36:26,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:26,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=482446.6666666667, ans=0.125 2023-09-29 20:36:27,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:27,744 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 20:36:29,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:36:30,149 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.47 vs. limit=15.0 2023-09-29 20:36:30,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:36:30,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:36:30,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:36:30,988 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 20:36:35,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:35,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 20:36:37,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:37,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 20:36:38,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 20:36:40,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:40,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:36:40,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=482513.3333333333, ans=0.0 2023-09-29 20:36:43,578 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 20:36:45,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 20:36:45,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:36:46,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 20:36:49,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:36:52,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 20:36:54,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:36:57,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:57,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:57,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:57,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:37:00,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:37:00,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:01,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:37:03,259 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 20:37:03,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 20:37:05,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:37:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:37:07,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:09,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:37:09,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:11,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:37:11,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:13,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:37:14,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:16,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:37:19,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 20:37:21,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:21,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:24,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 20:37:24,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:37:24,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:26,184 INFO [train.py:1039] (0/4) Epoch 14, batch 3350, loss[loss=0.1758, simple_loss=0.2455, pruned_loss=0.05302, over 24346.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2642, pruned_loss=0.05879, over 4732499.70 frames. ], batch size: 56, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:37:27,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:27,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:30,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:37:33,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:35,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:37:38,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:40,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:37:42,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:37:42,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:37:44,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 20:37:45,617 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 20:37:47,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:49,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 20:37:49,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 20:37:49,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:37:50,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:37:50,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:53,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 20:37:53,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:53,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 20:37:56,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:58,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:58,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:00,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:38:03,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:06,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:06,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:11,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:38:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:14,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:16,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:38:16,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=482913.3333333333, ans=0.125 2023-09-29 20:38:17,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 20:38:19,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:38:21,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 20:38:21,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:38:23,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 20:38:23,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:25,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:32,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:33,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 20:38:33,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 20:38:34,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=482980.0, ans=0.0 2023-09-29 20:38:35,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:38:35,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:38:41,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:38:41,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=482980.0, ans=0.125 2023-09-29 20:38:43,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 20:38:44,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:38:44,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:38:46,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:46,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 20:38:48,229 INFO [train.py:1039] (0/4) Epoch 14, batch 3400, loss[loss=0.2734, simple_loss=0.3239, pruned_loss=0.1114, over 19910.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2646, pruned_loss=0.05886, over 4737167.38 frames. 
], batch size: 388, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:38:48,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:48,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 20:38:49,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:49,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:51,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:38:53,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:38:53,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 20:38:56,129 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.936e+02 2.106e+02 2.472e+02 5.174e+02, threshold=4.212e+02, percent-clipped=2.0 2023-09-29 20:38:57,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.83 vs. limit=12.0 2023-09-29 20:38:58,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 20:38:58,496 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 20:38:58,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:39:02,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:39:02,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:39:03,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:05,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:39:06,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=483113.3333333333, ans=0.0 2023-09-29 20:39:12,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:14,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 20:39:18,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:39:21,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:23,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:25,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:39:30,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 20:39:35,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 20:39:40,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 20:39:42,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:39:43,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:43,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=483246.6666666667, ans=0.2 2023-09-29 20:39:45,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:45,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:39:46,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:49,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:39:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 20:39:54,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:39:54,828 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:39:54,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=483313.3333333333, ans=0.125 2023-09-29 20:39:57,389 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.26 vs. limit=15.0 2023-09-29 20:39:58,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 20:40:04,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:40:09,462 INFO [train.py:1039] (0/4) Epoch 14, batch 3450, loss[loss=0.1744, simple_loss=0.2591, pruned_loss=0.04485, over 24629.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2644, pruned_loss=0.05893, over 4736790.00 frames. ], batch size: 68, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:40:11,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 20:40:14,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 20:40:14,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:40:15,407 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=483380.0, ans=0.125 2023-09-29 20:40:16,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:40:16,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 20:40:17,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:40:21,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:40:24,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:40:25,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:27,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:40:27,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:30,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:37,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. 
Duration: 0.7 2023-09-29 20:40:40,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=483446.6666666667, ans=0.125 2023-09-29 20:40:43,410 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.56 vs. limit=10.0 2023-09-29 20:40:44,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 20:40:44,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:40:44,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:40:46,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:51,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 20:40:51,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:40:56,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:40:56,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:57,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:40:59,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 20:41:01,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 20:41:01,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:03,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:41:08,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:08,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=483580.0, ans=0.05 2023-09-29 20:41:11,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 20:41:14,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:41:17,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=483646.6666666667, ans=0.125 2023-09-29 20:41:18,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:41:21,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:23,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:41:28,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:28,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:41:29,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:41:29,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:32,573 INFO [train.py:1039] (0/4) Epoch 14, batch 3500, loss[loss=0.1784, simple_loss=0.238, pruned_loss=0.05941, over 23461.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2622, pruned_loss=0.05871, over 4718412.24 frames. ], batch size: 256, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:41:32,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=483713.3333333333, ans=0.0 2023-09-29 20:41:34,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:38,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:41:38,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 20:41:41,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.028e+02 2.360e+02 2.884e+02 5.509e+02, threshold=4.720e+02, percent-clipped=5.0 2023-09-29 20:41:41,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 20:41:44,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:41:49,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:49,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 20:41:54,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:41:54,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:56,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:41:56,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:41:56,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:41:56,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=483780.0, ans=0.2 2023-09-29 20:41:58,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:58,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 20:41:58,369 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=483780.0, ans=0.0 2023-09-29 20:41:59,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 20:42:00,568 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.40 vs. limit=15.0 2023-09-29 20:42:02,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:02,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:42:03,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=483780.0, ans=0.125 2023-09-29 20:42:04,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:07,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:09,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 20:42:09,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:42:13,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 20:42:15,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=483846.6666666667, ans=0.0 2023-09-29 20:42:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:42:18,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:19,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:42:19,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:21,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 20:42:22,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 20:42:23,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 20:42:25,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:26,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:27,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:28,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 20:42:30,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:42:30,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:42:31,289 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.22 vs. limit=22.5 2023-09-29 20:42:32,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=483913.3333333333, ans=0.125 2023-09-29 20:42:35,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:42:38,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 20:42:38,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 20:42:38,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:42:40,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:40,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:41,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:45,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. 
Duration: 0.96 2023-09-29 20:42:46,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:47,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:49,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 20:42:52,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 20:42:55,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:55,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:55,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:42:55,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:42:56,646 INFO [train.py:1039] (0/4) Epoch 14, batch 3550, loss[loss=0.1913, simple_loss=0.2815, pruned_loss=0.05053, over 24586.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2615, pruned_loss=0.05838, over 4726666.38 frames. ], batch size: 71, lr: 7.35e-03, grad_scale: 16.0 2023-09-29 20:42:58,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:43:11,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 20:43:12,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:43:14,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:16,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:43:16,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:18,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:43:18,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:43:18,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=484113.3333333333, ans=0.0 2023-09-29 20:43:22,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:23,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:43:23,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:23,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:43:25,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 20:43:32,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:43:32,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:34,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:34,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:36,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:43:36,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 20:43:36,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:37,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:39,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:43:46,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:46,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:43:46,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=484246.6666666667, ans=0.125 2023-09-29 20:43:47,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:48,036 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=484246.6666666667, ans=0.1 2023-09-29 20:43:49,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 20:43:51,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:43:51,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 20:43:52,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:54,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:43:54,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:43:57,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 20:43:59,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:04,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:44:04,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 20:44:06,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:09,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=484313.3333333333, ans=0.0 2023-09-29 20:44:10,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:44:11,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=484313.3333333333, ans=0.2 2023-09-29 20:44:12,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 20:44:19,143 INFO [train.py:1039] (0/4) Epoch 14, batch 3600, loss[loss=0.2251, simple_loss=0.2687, pruned_loss=0.09072, over 18832.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2604, pruned_loss=0.05821, over 4708181.57 frames. ], batch size: 389, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:44:19,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 20:44:19,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:44:20,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:44:22,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:44:24,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:24,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:44:27,543 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 2.079e+02 2.493e+02 3.972e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-29 20:44:31,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:31,586 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=484380.0, ans=0.95 2023-09-29 20:44:32,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:34,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:44:35,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:44:37,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:37,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 20:44:40,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 20:44:40,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:44,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:48,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:44:48,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:44:49,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:49,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 20:44:50,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:53,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:55,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:44:57,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:59,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:45:01,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:45:01,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 20:45:02,181 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.40 vs. limit=10.0 2023-09-29 20:45:09,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:10,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:45:10,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 20:45:15,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:45:20,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:23,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:29,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:45:30,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:45:30,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 20:45:32,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-09-29 20:45:32,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 20:45:32,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=484646.6666666667, ans=0.2 2023-09-29 20:45:35,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:35,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:45:35,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 20:45:37,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:45:37,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:45:37,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:37,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 20:45:38,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 20:45:42,520 INFO [train.py:1039] (0/4) Epoch 14, batch 3650, loss[loss=0.1967, simple_loss=0.2643, pruned_loss=0.06459, over 23584.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.261, pruned_loss=0.05824, over 4705739.27 frames. 
], batch size: 256, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:45:42,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:44,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 20:45:49,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 20:45:50,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:45:52,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=484713.3333333333, ans=0.1 2023-09-29 20:45:54,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 20:45:55,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 20:46:00,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:00,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:46:02,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:46:06,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:46:06,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 20:46:07,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 20:46:07,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:46:09,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:09,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 20:46:10,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:46:11,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=484780.0, ans=0.125 2023-09-29 20:46:12,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:12,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:13,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:46:15,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 20:46:17,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 20:46:17,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:46:19,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 20:46:21,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:21,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:46:28,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:46:30,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:30,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:46:31,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:46:33,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:46:35,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:46:39,720 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=484913.3333333333, ans=0.1 2023-09-29 20:46:41,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:46:42,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:42,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:42,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:46:44,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:45,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:52,696 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 20:46:54,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:54,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:56,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:46:56,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:46:58,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:46:59,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:47:01,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 20:47:01,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:04,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:47:05,620 INFO [train.py:1039] (0/4) Epoch 14, batch 3700, loss[loss=0.1911, simple_loss=0.2589, pruned_loss=0.06166, over 23367.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2631, pruned_loss=0.05962, over 4687261.76 frames. ], batch size: 119, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:47:05,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:47:07,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:47:07,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=485046.6666666667, ans=0.125 2023-09-29 20:47:08,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:08,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 20:47:09,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:11,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-29 20:47:12,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:47:13,910 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.934e+02 2.131e+02 2.298e+02 2.848e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-29 20:47:14,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:47:14,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=485046.6666666667, ans=0.0 2023-09-29 20:47:17,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:47:18,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:18,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:47:18,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:20,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:47:21,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:24,095 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-09-29 20:47:27,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=485113.3333333333, ans=0.0 2023-09-29 20:47:27,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=485113.3333333333, ans=0.125 2023-09-29 20:47:33,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:47:33,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:47:35,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:47:35,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 20:47:35,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:40,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:41,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 20:47:41,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:43,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:47:47,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:47:49,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:47:52,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:47:53,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:55,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 20:47:55,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:56,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 20:48:00,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:48:01,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=485246.6666666667, ans=0.125 2023-09-29 20:48:02,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:48:05,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:07,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 20:48:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:48:08,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:48:10,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:10,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:13,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:15,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 20:48:16,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 20:48:18,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:48:18,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:18,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:48:20,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:48:23,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:48:24,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 20:48:25,219 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=485313.3333333333, ans=0.0 2023-09-29 20:48:26,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:48:27,884 INFO [train.py:1039] (0/4) Epoch 14, batch 3750, loss[loss=0.1824, simple_loss=0.2666, pruned_loss=0.04913, over 24471.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2643, pruned_loss=0.05994, over 4694348.08 frames. ], batch size: 69, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:48:29,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 20:48:31,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:48:31,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=485380.0, ans=0.2 2023-09-29 20:48:34,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:48:34,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 20:48:36,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:48:37,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 20:48:38,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=485380.0, ans=0.0 2023-09-29 20:48:39,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:48:43,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:48:46,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:48:47,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:48:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:54,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:48:56,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 20:48:56,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:48:57,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:48:58,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:49:02,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 20:49:05,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 20:49:07,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:49:09,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:49:09,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:16,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:16,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:49:18,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=485580.0, ans=0.1 2023-09-29 20:49:21,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 20:49:23,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:27,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:49:29,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:49:33,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:49:35,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=485646.6666666667, ans=0.125 2023-09-29 20:49:37,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:49:39,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:49:41,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:49:43,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:49:46,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:49:51,112 INFO [train.py:1039] (0/4) Epoch 14, batch 3800, loss[loss=0.1991, simple_loss=0.2704, pruned_loss=0.06391, over 23398.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2641, pruned_loss=0.05987, over 4699797.29 frames. ], batch size: 93, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:49:54,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:49:57,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:49:59,283 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.914e+02 2.283e+02 2.549e+02 3.824e+02, threshold=4.565e+02, percent-clipped=0.0 2023-09-29 20:49:59,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:49:59,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 20:50:01,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:04,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:04,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:50:08,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 20:50:08,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:09,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:50:12,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:12,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 20:50:12,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:12,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 20:50:18,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:50:19,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:50:22,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:25,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:50:25,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:50:29,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:50:29,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:29,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=485846.6666666667, ans=0.0 2023-09-29 20:50:32,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:32,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 20:50:37,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:50:37,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 20:50:38,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:50:47,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:50:54,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:50:55,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 20:50:57,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 20:50:57,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:59,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=485980.0, ans=0.125 2023-09-29 20:51:00,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:51:00,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:02,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-09-29 20:51:05,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 20:51:05,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 20:51:05,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:07,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:51:13,731 INFO [train.py:1039] (0/4) Epoch 14, batch 3850, loss[loss=0.1974, simple_loss=0.2604, pruned_loss=0.06722, over 23831.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2636, pruned_loss=0.05983, over 4687086.81 frames. ], batch size: 179, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:51:15,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:51:16,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:51:16,846 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.26 vs. limit=10.0 2023-09-29 20:51:20,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:51:20,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 20:51:22,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 20:51:23,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:26,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:51:30,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:31,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:51:33,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.27 vs. limit=15.0 2023-09-29 20:51:34,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 20:51:40,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:42,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:44,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:46,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:51:50,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:51:50,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:51:50,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:50,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:51:52,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:54,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:55,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:55,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:51:57,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 20:51:59,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 20:51:59,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:59,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 20:52:02,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 20:52:05,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 20:52:07,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:09,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 20:52:12,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:52:18,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:20,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:23,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:23,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 20:52:27,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 20:52:29,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 20:52:29,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:32,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:52:32,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:52:34,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:52:35,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 20:52:37,421 INFO [train.py:1039] (0/4) Epoch 14, batch 3900, loss[loss=0.1838, simple_loss=0.2655, pruned_loss=0.05108, over 24355.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2634, pruned_loss=0.05889, over 4707973.99 frames. ], batch size: 77, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:52:37,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:52:37,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 20:52:39,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:52:39,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:39,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:52:40,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:41,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:52:42,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:42,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:42,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:52:44,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 20:52:44,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:47,046 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.944e+02 2.146e+02 2.547e+02 3.892e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 20:52:48,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 20:52:48,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:48,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:52:53,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:54,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:56,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:58,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:52:59,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 20:52:59,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:52:59,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=486446.6666666667, ans=0.0 2023-09-29 20:53:01,601 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:53:03,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 20:53:03,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 20:53:03,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 20:53:05,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 20:53:08,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:10,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:53:10,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:53:10,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:15,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:18,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:53:19,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:53:19,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:21,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:53:28,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:53:28,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:53:37,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:53:40,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:53:49,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:53:50,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=486646.6666666667, ans=0.1 2023-09-29 20:53:51,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:53,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 20:53:54,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 20:53:54,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:56,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 20:53:56,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:57,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. 
Duration: 0.99 2023-09-29 20:54:00,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=486713.3333333333, ans=0.0 2023-09-29 20:54:01,141 INFO [train.py:1039] (0/4) Epoch 14, batch 3950, loss[loss=0.1865, simple_loss=0.2587, pruned_loss=0.05715, over 23326.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2623, pruned_loss=0.05872, over 4703715.27 frames. ], batch size: 105, lr: 7.33e-03, grad_scale: 16.0 2023-09-29 20:54:01,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=486713.3333333333, ans=0.125 2023-09-29 20:54:05,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:54:08,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 20:54:08,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:54:09,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:54:11,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:54:17,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.whiten.whitening_limit, batch_count=486780.0, ans=15.0 2023-09-29 20:54:18,281 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 20:54:18,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 20:54:19,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 20:54:19,908 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 20:54:19,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:54:22,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:22,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:54:22,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:25,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 20:54:27,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:54:29,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:29,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:54:29,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:54:29,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 20:54:35,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=486846.6666666667, ans=0.0 2023-09-29 20:54:41,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:54:43,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:54:49,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 20:54:54,223 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.69 vs. limit=12.0 2023-09-29 20:54:55,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 20:54:55,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 20:54:56,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:54:58,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:55:04,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=486913.3333333333, ans=0.0 2023-09-29 20:55:05,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:55:07,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 20:55:07,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:09,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:55:09,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 20:55:14,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:55:16,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:55:19,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 20:55:25,252 INFO [train.py:1039] (0/4) Epoch 14, batch 4000, loss[loss=0.2051, simple_loss=0.2695, pruned_loss=0.07036, over 23614.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2632, pruned_loss=0.05916, over 4707921.45 frames. ], batch size: 232, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:55:27,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=487046.6666666667, ans=0.2 2023-09-29 20:55:29,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:55:33,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=487046.6666666667, ans=0.0 2023-09-29 20:55:34,255 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.864e+02 2.068e+02 2.328e+02 3.219e+02, threshold=4.135e+02, percent-clipped=0.0 2023-09-29 20:55:34,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=487046.6666666667, ans=0.125 2023-09-29 20:55:39,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:42,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:44,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:55:44,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:44,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 20:55:45,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:55:45,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 20:55:45,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-29 20:55:45,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 20:55:49,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:52,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:55:52,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:55:52,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:55:53,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:53,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:55:54,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=487113.3333333333, ans=0.125 2023-09-29 20:55:57,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:55:57,431 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 20:55:58,274 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.69 vs. limit=6.0 2023-09-29 20:55:59,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 20:55:59,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:01,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=487180.0, ans=0.1 2023-09-29 20:56:01,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=487180.0, ans=0.125 2023-09-29 20:56:02,583 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 20:56:04,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:56:04,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:09,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=487180.0, ans=0.125 2023-09-29 20:56:10,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 20:56:10,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:56:13,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:56:13,660 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 20:56:15,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 20:56:16,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 20:56:16,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:56:17,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:18,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:56:20,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:56:20,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:56:20,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:22,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 20:56:22,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:24,186 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 20:56:30,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:56:32,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-09-29 20:56:34,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:56:36,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:37,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:56:39,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:56:43,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:46,227 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.28 vs. limit=15.0 2023-09-29 20:56:46,881 INFO [train.py:1039] (0/4) Epoch 14, batch 4050, loss[loss=0.1694, simple_loss=0.2522, pruned_loss=0.04326, over 24670.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2641, pruned_loss=0.05946, over 4713834.42 frames. ], batch size: 65, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:56:46,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:56:47,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 20:56:49,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:56:51,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-29 20:56:52,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:56:54,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:56:54,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:58,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:59,532 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.88 vs. limit=22.5 2023-09-29 20:57:01,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:02,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:57:03,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:57:03,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:57:09,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:10,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 20:57:12,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 20:57:13,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 20:57:14,020 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 20:57:17,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:57:24,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 20:57:27,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:57:30,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:33,142 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.22 vs. limit=15.0 2023-09-29 20:57:33,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:33,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:57:33,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:37,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 20:57:39,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=487580.0, ans=0.5 2023-09-29 20:57:40,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 20:57:40,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:57:42,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:46,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 20:57:50,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:54,915 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.77 vs. limit=5.0 2023-09-29 20:57:55,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=487646.6666666667, ans=0.125 2023-09-29 20:57:59,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 20:57:59,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:01,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:58:02,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-09-29 20:58:02,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 20:58:02,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:04,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:06,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:06,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:58:10,373 INFO [train.py:1039] (0/4) Epoch 14, batch 4100, loss[loss=0.2013, simple_loss=0.2683, pruned_loss=0.0671, over 23857.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2654, pruned_loss=0.06, over 4720300.13 frames. ], batch size: 195, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:58:14,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 20:58:15,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 20:58:18,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. 
Duration: 0.94 2023-09-29 20:58:19,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=487713.3333333333, ans=0.0 2023-09-29 20:58:20,528 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.945e+02 2.209e+02 2.502e+02 4.292e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 20:58:20,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 20:58:20,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:20,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:20,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:22,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:58:22,378 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 20:58:24,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:24,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=487713.3333333333, ans=0.1 2023-09-29 20:58:25,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:58:25,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 20:58:26,207 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.26 vs. limit=15.0 2023-09-29 20:58:27,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:58:28,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=487780.0, ans=0.125 2023-09-29 20:58:32,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:58:33,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:33,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:58:33,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 20:58:35,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:35,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:58:35,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 20:58:35,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=487780.0, ans=0.125 2023-09-29 20:58:37,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:58:37,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 20:58:38,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:58:40,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 20:58:42,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:43,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=487846.6666666667, ans=0.1 2023-09-29 20:58:45,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:45,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 20:58:45,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:47,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:58:47,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 20:58:50,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 20:58:52,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:58:54,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:58:55,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 20:58:55,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:57,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:58:59,917 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.55 vs. limit=15.0 2023-09-29 20:59:00,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:05,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:10,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:10,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:59:21,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 20:59:21,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:25,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:28,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:59:30,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.61 vs. limit=22.5 2023-09-29 20:59:32,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=488046.6666666667, ans=0.125 2023-09-29 20:59:33,283 INFO [train.py:1039] (0/4) Epoch 14, batch 4150, loss[loss=0.2013, simple_loss=0.2593, pruned_loss=0.07166, over 23820.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2645, pruned_loss=0.05994, over 4714020.51 frames. ], batch size: 195, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 20:59:33,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:59:34,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:59:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:59:35,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 20:59:37,319 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-09-29 20:59:38,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 20:59:38,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:40,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 20:59:40,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=488046.6666666667, ans=0.1 2023-09-29 20:59:41,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 20:59:41,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 20:59:43,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:48,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:59:48,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:53,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 20:59:54,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:59:56,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:59:57,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:59:57,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:58,354 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=488113.3333333333, ans=0.0 2023-09-29 20:59:59,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:00:04,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:05,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.18 vs. limit=22.5 2023-09-29 21:00:09,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:09,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 21:00:12,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 21:00:12,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 21:00:13,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 21:00:13,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:00:14,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:17,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:17,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:23,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 21:00:26,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:28,025 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:00:29,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:00:29,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 21:00:30,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:32,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. 
Duration: 0.91 2023-09-29 21:00:34,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:00:36,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=488246.6666666667, ans=0.125 2023-09-29 21:00:37,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:37,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:39,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 21:00:39,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:00:39,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:00:41,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:00:44,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 21:00:45,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:45,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:00:45,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 21:00:45,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 21:00:45,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:47,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 21:00:48,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:50,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:50,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 21:00:50,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:52,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=488313.3333333333, ans=0.07 2023-09-29 21:00:55,492 INFO [train.py:1039] (0/4) Epoch 14, batch 4200, loss[loss=0.1864, simple_loss=0.2492, pruned_loss=0.06181, over 23433.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2631, pruned_loss=0.05956, over 4705516.00 frames. ], batch size: 119, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:00:55,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:00:59,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. 
Duration: 0.94 2023-09-29 21:01:00,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:01:03,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:04,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=488380.0, ans=0.125 2023-09-29 21:01:05,340 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.014e+02 2.292e+02 2.781e+02 4.764e+02, threshold=4.585e+02, percent-clipped=1.0 2023-09-29 21:01:05,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:01:05,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:05,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:09,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 21:01:11,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 21:01:12,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:14,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 21:01:16,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=488446.6666666667, ans=0.0 2023-09-29 21:01:19,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:01:22,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:01:22,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:23,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:23,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 21:01:23,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:25,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:25,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:25,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:01:27,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:01:28,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. 
Duration: 0.78 2023-09-29 21:01:28,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:33,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:01:33,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=488513.3333333333, ans=0.025 2023-09-29 21:01:34,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:01:36,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=488513.3333333333, ans=0.0 2023-09-29 21:01:37,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:01:39,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:01:42,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:01:42,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 21:01:42,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:01:42,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 21:01:43,275 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=488513.3333333333, ans=0.125 2023-09-29 21:01:44,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=488580.0, ans=0.125 2023-09-29 21:01:48,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:01:51,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:57,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:01:59,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 21:02:02,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:03,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=488646.6666666667, ans=0.125 2023-09-29 21:02:09,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:02:09,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:12,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. 
Duration: 0.94 2023-09-29 21:02:17,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:02:19,031 INFO [train.py:1039] (0/4) Epoch 14, batch 4250, loss[loss=0.1733, simple_loss=0.2528, pruned_loss=0.04688, over 24688.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2626, pruned_loss=0.0589, over 4716820.98 frames. ], batch size: 65, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:02:22,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:02:22,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:02:25,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:30,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:02:32,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 21:02:32,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:02:33,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:36,162 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:02:38,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:02:41,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=488780.0, ans=15.0 2023-09-29 21:02:42,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=488780.0, ans=0.125 2023-09-29 21:02:44,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:45,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:47,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:02:47,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:02:50,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:51,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:53,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:55,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:02:57,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:02:58,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 21:03:00,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=488846.6666666667, ans=0.2 2023-09-29 21:03:01,189 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.89 vs. limit=10.0 2023-09-29 21:03:03,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 21:03:03,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:03,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:03,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:03:06,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:03:06,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:06,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:08,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:03:11,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-09-29 21:03:15,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:17,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:18,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 21:03:19,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:03:19,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 21:03:20,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:03:22,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:03:23,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:24,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:03:25,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 21:03:27,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:03:27,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-09-29 21:03:32,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:35,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:36,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:03:39,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:40,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:42,508 INFO [train.py:1039] (0/4) Epoch 14, batch 4300, loss[loss=0.1844, simple_loss=0.2695, pruned_loss=0.04961, over 24586.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2612, pruned_loss=0.0584, over 4714226.60 frames. ], batch size: 71, lr: 7.32e-03, grad_scale: 16.0 2023-09-29 21:03:42,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:03:42,996 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=489046.6666666667, ans=0.125 2023-09-29 21:03:44,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:03:44,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 21:03:45,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:03:50,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:50,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:03:53,187 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.403e+02 1.977e+02 2.365e+02 3.006e+02 5.319e+02, threshold=4.729e+02, percent-clipped=1.0 2023-09-29 21:03:54,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=489046.6666666667, ans=0.1 2023-09-29 21:03:57,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:04:00,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=489113.3333333333, ans=0.0 2023-09-29 21:04:04,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:04:04,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 21:04:06,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:04:09,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:04:09,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:04:09,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-09-29 21:04:10,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:04:12,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:15,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 21:04:15,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:04:17,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 21:04:20,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:04:22,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:04:22,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:04:23,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:04:25,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:04:27,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:29,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 21:04:29,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 21:04:29,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 21:04:32,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:04:34,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:34,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:04:34,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:36,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:36,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 21:04:36,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 21:04:38,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 21:04:40,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:04:40,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. 
Duration: 0.75 2023-09-29 21:04:40,237 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=489246.6666666667, ans=0.125 2023-09-29 21:04:41,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 21:04:44,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=489246.6666666667, ans=0.1 2023-09-29 21:04:46,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:47,706 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 21:04:47,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:04:49,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:49,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:51,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 21:04:52,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:52,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:53,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:04:53,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:04:54,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:04:57,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:04:58,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:00,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:01,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:05:06,038 INFO [train.py:1039] (0/4) Epoch 14, batch 4350, loss[loss=0.1992, simple_loss=0.2611, pruned_loss=0.06858, over 22845.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2626, pruned_loss=0.05925, over 4711515.28 frames. ], batch size: 322, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:05:06,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=489380.0, ans=0.0 2023-09-29 21:05:07,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 21:05:07,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:05:13,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:05:14,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=489380.0, ans=0.2 2023-09-29 21:05:16,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:19,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:05:19,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:05:25,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:05:27,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:30,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:05:30,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:05:35,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:05:36,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:05:38,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:05:44,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-09-29 21:05:44,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:46,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:50,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:53,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 21:05:56,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:05:56,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:05:56,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=489580.0, ans=0.0 2023-09-29 21:06:01,042 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 21:06:03,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:04,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:06:04,827 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 21:06:06,272 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 21:06:06,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:06:06,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:07,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:06:08,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:09,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:09,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:13,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 21:06:13,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:13,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:13,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:14,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 21:06:16,068 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 21:06:16,075 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. 
Duration: 0.896 2023-09-29 21:06:16,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 21:06:18,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:06:20,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:06:20,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:20,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:06:23,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 21:06:23,793 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.89 vs. limit=22.5 2023-09-29 21:06:26,160 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 21:06:26,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:27,561 INFO [train.py:1039] (0/4) Epoch 14, batch 4400, loss[loss=0.2183, simple_loss=0.2961, pruned_loss=0.07022, over 23967.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.263, pruned_loss=0.05916, over 4718586.27 frames. ], batch size: 80, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:06:29,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:06:29,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:32,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:34,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=489713.3333333333, ans=0.125 2023-09-29 21:06:35,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 21:06:35,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 21:06:37,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 21:06:37,336 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 21:06:37,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:06:37,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:38,935 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.905e+02 2.228e+02 2.642e+02 4.473e+02, threshold=4.456e+02, percent-clipped=0.0 2023-09-29 21:06:40,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 21:06:42,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 21:06:43,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:43,782 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 21:06:47,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:47,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 21:06:47,537 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 21:06:50,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 21:06:50,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=489780.0, ans=0.125 2023-09-29 21:06:52,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 21:06:52,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 21:06:52,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:54,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:54,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:06:56,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:59,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 21:06:59,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 21:06:59,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:02,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:07:02,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:04,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:05,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:05,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 21:07:07,003 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 21:07:10,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:16,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:07:19,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 21:07:23,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:07:26,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:28,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:07:30,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 21:07:30,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:07:30,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:07:30,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:07:31,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:07:37,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 21:07:39,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 21:07:40,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. 
Duration: 0.88 2023-09-29 21:07:40,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:40,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 21:07:42,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:07:45,198 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-09-29 21:07:45,450 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.51 vs. limit=22.5 2023-09-29 21:07:45,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:07:47,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 21:07:49,002 INFO [train.py:1039] (0/4) Epoch 14, batch 4450, loss[loss=0.1917, simple_loss=0.2762, pruned_loss=0.05361, over 24636.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2638, pruned_loss=0.05934, over 4730738.77 frames. ], batch size: 68, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:07:50,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:53,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:55,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-29 21:07:56,071 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.34 vs. limit=15.0 2023-09-29 21:08:02,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:02,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:08:05,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:07,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:08:10,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:08:12,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:12,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=490113.3333333333, ans=0.125 2023-09-29 21:08:13,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 21:08:13,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:13,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:08:13,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:08:13,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:08:16,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:08:22,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=490180.0, ans=0.125 2023-09-29 21:08:23,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:23,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:25,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:26,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:27,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:08:33,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:08:35,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 21:08:35,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. 
Duration: 0.86 2023-09-29 21:08:35,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:08:38,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:40,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 21:08:44,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:08:49,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:49,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 21:08:49,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:49,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:08:49,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:08:49,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:52,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:56,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 21:08:57,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 21:08:59,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:08:59,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:02,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:09:02,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:02,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:09:06,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:09:10,408 INFO [train.py:1039] (0/4) Epoch 14, batch 4500, loss[loss=0.1839, simple_loss=0.2425, pruned_loss=0.06267, over 23328.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2647, pruned_loss=0.05951, over 4728205.70 frames. ], batch size: 285, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:09:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 21:09:11,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:09:17,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 21:09:17,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 21:09:17,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 21:09:19,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:23,953 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.876e+02 2.126e+02 2.360e+02 4.104e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 21:09:24,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:24,498 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:09:25,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:25,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:09:27,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:09:27,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:27,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 21:09:35,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=490446.6666666667, ans=0.1 2023-09-29 21:09:41,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:42,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:09:45,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:45,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:09:47,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:09:53,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:09:59,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:10:00,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:10:06,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:10:06,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 21:10:07,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:10:07,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:07,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=490580.0, ans=0.0 2023-09-29 21:10:11,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:11,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:10:14,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:10:14,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 21:10:14,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:10:14,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:19,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:10:21,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:10:23,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:24,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 21:10:26,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:10:27,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 21:10:29,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 21:10:29,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 21:10:32,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 21:10:34,056 INFO [train.py:1039] (0/4) Epoch 14, batch 4550, loss[loss=0.1864, simple_loss=0.253, pruned_loss=0.05985, over 23566.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2638, pruned_loss=0.05913, over 4724852.78 frames. ], batch size: 106, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:10:36,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 21:10:36,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:10:39,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:41,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:10:43,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=490713.3333333333, ans=0.0 2023-09-29 21:10:45,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:49,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:10:52,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:52,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:10:52,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:10:52,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:55,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:57,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:11:01,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:03,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 21:11:04,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. 
Duration: 0.626375 2023-09-29 21:11:06,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:11:07,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 21:11:09,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 21:11:11,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:13,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 21:11:15,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:11:18,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:18,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:19,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:11:21,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 21:11:25,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:27,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:11:27,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:29,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:31,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 21:11:32,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 21:11:32,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:11:32,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 21:11:36,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 21:11:36,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:38,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:11:38,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:39,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:39,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 21:11:42,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:11:42,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 21:11:43,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=490980.0, ans=0.125 2023-09-29 21:11:44,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:44,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:11:45,166 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:11:46,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 21:11:46,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:11:46,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 21:11:46,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=490980.0, ans=0.0 2023-09-29 21:11:51,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:11:51,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 21:11:53,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=490980.0, ans=0.0 2023-09-29 21:11:54,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:11:54,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:54,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:11:57,360 INFO [train.py:1039] (0/4) Epoch 14, batch 4600, loss[loss=0.1986, simple_loss=0.2743, pruned_loss=0.06142, over 23384.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2622, pruned_loss=0.05855, over 4712567.22 frames. ], batch size: 106, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:11:57,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:11:57,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:12:02,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:03,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:12:07,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:12:07,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 21:12:08,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:09,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 21:12:11,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:12:12,507 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.889e+02 2.188e+02 2.520e+02 3.712e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 21:12:15,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:12:15,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:16,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.46 vs. limit=22.5 2023-09-29 21:12:17,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:27,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 21:12:27,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:30,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:12:34,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:12:34,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:38,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 21:12:38,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:12:38,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:12:44,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:44,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:12:46,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:12:50,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 21:12:52,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:12:57,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:12:58,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 21:13:01,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:01,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 21:13:01,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:02,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 21:13:02,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:02,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:05,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:05,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:13:07,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:07,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 21:13:07,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 21:13:09,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. 
Duration: 0.82 2023-09-29 21:13:09,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:09,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:11,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:11,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:15,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=491313.3333333333, ans=0.2 2023-09-29 21:13:20,987 INFO [train.py:1039] (0/4) Epoch 14, batch 4650, loss[loss=0.1568, simple_loss=0.2283, pruned_loss=0.04266, over 24272.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2607, pruned_loss=0.05837, over 4698604.41 frames. ], batch size: 56, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:13:24,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:13:27,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:28,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:28,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:13:28,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:13:30,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:30,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:34,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 21:13:39,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:13:40,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 21:13:42,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:42,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 21:13:42,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:13:44,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 21:13:44,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 21:13:44,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:44,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 21:13:48,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:13:49,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:49,595 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 21:13:53,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:56,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 21:13:59,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:00,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:14:00,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 21:14:02,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:14:06,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:14:09,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:14,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:14:17,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:17,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:19,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:14:20,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 21:14:23,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 21:14:23,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 21:14:23,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 21:14:23,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=491580.0, ans=0.0 2023-09-29 21:14:24,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:25,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=491580.0, ans=0.1 2023-09-29 21:14:25,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=491580.0, ans=0.0 2023-09-29 21:14:31,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 21:14:31,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:14:31,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 21:14:31,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=491646.6666666667, ans=0.125 2023-09-29 21:14:32,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:34,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:34,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:14:36,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:14:37,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:14:37,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:39,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:43,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:44,414 INFO [train.py:1039] (0/4) Epoch 14, batch 4700, loss[loss=0.1688, simple_loss=0.2502, pruned_loss=0.04369, over 24477.00 frames. 
], tot_loss[loss=0.1883, simple_loss=0.2609, pruned_loss=0.05788, over 4703573.69 frames. ], batch size: 63, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:14:44,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:14:44,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:14:44,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:14:46,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:14:46,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=491713.3333333333, ans=0.125 2023-09-29 21:14:48,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 21:14:51,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=491713.3333333333, ans=0.125 2023-09-29 21:14:56,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:58,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:59,753 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.978e+02 2.336e+02 2.752e+02 4.215e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 21:14:59,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:15:00,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:02,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:15:08,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 21:15:08,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 21:15:09,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:11,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:15:11,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:15:13,357 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:15:14,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:21,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:15:23,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. 
Duration: 0.3909375 2023-09-29 21:15:23,740 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:15:24,260 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.34 vs. limit=15.0 2023-09-29 21:15:25,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:25,739 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.10 vs. limit=15.0 2023-09-29 21:15:31,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 21:15:32,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:15:36,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:38,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 21:15:40,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:15:43,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:15:45,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 21:15:46,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 21:15:46,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:51,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:51,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:15:51,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 21:15:54,102 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 21:15:55,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:56,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:56,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:56,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 21:15:59,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:16:02,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 21:16:05,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:16:05,526 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:16:05,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=491980.0, ans=0.2 2023-09-29 21:16:07,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=492046.6666666667, ans=0.0 2023-09-29 21:16:08,523 INFO [train.py:1039] (0/4) Epoch 14, batch 4750, loss[loss=0.1918, simple_loss=0.2698, pruned_loss=0.0569, over 23772.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2622, pruned_loss=0.05829, over 4713921.20 frames. ], batch size: 85, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:16:08,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:14,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:16:15,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 21:16:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:15,999 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=492046.6666666667, ans=0.125 2023-09-29 21:16:18,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. 
Duration: 0.81 2023-09-29 21:16:19,621 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.21 vs. limit=15.0 2023-09-29 21:16:20,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:16:21,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:16:22,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:27,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 21:16:29,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=492113.3333333333, ans=0.125 2023-09-29 21:16:32,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:16:32,864 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=492113.3333333333, ans=0.125 2023-09-29 21:16:35,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 21:16:35,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:38,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 21:16:38,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:39,462 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.97 vs. limit=22.5 2023-09-29 21:16:40,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:42,307 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 21:16:42,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 21:16:48,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 21:16:50,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:52,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:16:55,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=492180.0, ans=0.0 2023-09-29 21:16:56,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:16:56,486 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 21:16:56,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:16:58,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:17:01,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:17:04,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 21:17:04,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 21:17:04,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:17:04,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:17:04,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:06,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:17:08,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 21:17:10,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 21:17:12,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:16,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:17:16,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 21:17:18,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:19,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:17:21,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:17:23,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:23,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:17:26,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:26,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 21:17:28,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 21:17:29,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 21:17:30,935 INFO [train.py:1039] (0/4) Epoch 14, batch 4800, loss[loss=0.2074, simple_loss=0.2725, pruned_loss=0.07117, over 23423.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2633, pruned_loss=0.05913, over 4704650.11 frames. 
], batch size: 285, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:17:31,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=492380.0, ans=0.0 2023-09-29 21:17:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:17:34,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:36,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 21:17:40,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:42,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:45,644 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.984e+02 2.307e+02 2.840e+02 4.511e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 21:17:47,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:17:48,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:50,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:50,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. 
Duration: 0.86 2023-09-29 21:17:50,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:51,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:17:53,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:17:58,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:00,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:00,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:18:02,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:02,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:18:02,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:02,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=492513.3333333333, ans=0.0 2023-09-29 21:18:03,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:06,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:18:10,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:18:13,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:18:14,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:16,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 21:18:16,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 21:18:18,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:19,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:18:19,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:18:19,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:19,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 21:18:21,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:18:21,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:25,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=492580.0, ans=0.0 2023-09-29 21:18:26,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:30,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:32,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:34,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=492580.0, ans=0.0 2023-09-29 21:18:37,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 21:18:38,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:38,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:38,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:18:38,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:18:40,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=492646.6666666667, ans=0.2 2023-09-29 21:18:40,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=492646.6666666667, ans=0.125 2023-09-29 21:18:43,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:45,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:18:45,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:46,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:18:47,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:18:47,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:18:51,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:51,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:51,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:53,059 INFO [train.py:1039] (0/4) Epoch 14, batch 4850, loss[loss=0.1905, simple_loss=0.2711, pruned_loss=0.05492, over 24003.00 frames. 
], tot_loss[loss=0.1916, simple_loss=0.2637, pruned_loss=0.0597, over 4689396.61 frames. ], batch size: 80, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:18:53,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 21:18:56,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 21:18:56,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:18:56,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:00,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:19:07,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 21:19:08,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=492713.3333333333, ans=0.1 2023-09-29 21:19:10,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 21:19:11,184 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=492780.0, ans=0.125 2023-09-29 21:19:12,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=492780.0, ans=0.0 2023-09-29 21:19:13,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:15,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:19:15,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:19,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:20,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:19:22,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:19:22,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 21:19:26,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:19:28,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:19:28,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 21:19:30,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:19:30,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 21:19:33,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:33,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:37,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=492846.6666666667, ans=0.0 2023-09-29 21:19:38,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:38,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 21:19:40,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 21:19:40,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:19:47,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:19:48,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 21:19:50,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:19:50,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:19:53,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:19:55,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 21:19:55,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:55,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 21:19:55,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:57,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:19:58,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 21:20:00,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=492980.0, ans=0.125 2023-09-29 21:20:06,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:12,842 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.50 vs. limit=15.0 2023-09-29 21:20:13,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 21:20:13,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:17,052 INFO [train.py:1039] (0/4) Epoch 14, batch 4900, loss[loss=0.1895, simple_loss=0.2637, pruned_loss=0.05766, over 23314.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2626, pruned_loss=0.05939, over 4684380.47 frames. ], batch size: 105, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:20:18,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 21:20:18,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:20:23,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:25,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:25,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:20:30,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 21:20:31,616 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.872e+02 2.087e+02 2.309e+02 3.318e+02, threshold=4.174e+02, percent-clipped=0.0 2023-09-29 21:20:33,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 21:20:37,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. 
Duration: 0.93 2023-09-29 21:20:38,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 21:20:40,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:40,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:40,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:20:40,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:40,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:20:41,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 21:20:48,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 21:20:48,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:20:48,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:20:50,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:52,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 21:20:53,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:55,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:55,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 21:20:56,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:20:58,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:58,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 21:20:58,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 21:20:58,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=493180.0, ans=0.0 2023-09-29 21:21:04,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 21:21:06,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:21:07,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:07,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 21:21:08,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:08,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:21:09,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:21:09,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 21:21:11,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:12,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:21:14,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:21:19,451 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.61 vs. limit=15.0 2023-09-29 21:21:20,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 21:21:21,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:21:21,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:21:23,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-09-29 21:21:28,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:28,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=493313.3333333333, ans=0.125 2023-09-29 21:21:30,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=493313.3333333333, ans=0.2 2023-09-29 21:21:30,642 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.67 vs. limit=15.0 2023-09-29 21:21:31,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:21:32,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 21:21:33,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:33,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:21:36,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:39,476 INFO [train.py:1039] (0/4) Epoch 14, batch 4950, loss[loss=0.1701, simple_loss=0.2522, pruned_loss=0.04404, over 24317.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2614, pruned_loss=0.05896, over 4688221.05 frames. ], batch size: 61, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:21:39,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:21:39,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:21:39,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:39,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 21:21:42,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:21:44,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:21:45,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:49,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 21:21:49,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 21:21:49,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:21:49,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 21:21:49,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:49,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 21:21:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:21:51,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:21:54,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:55,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:21:55,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:21:57,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:22:00,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:00,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:22:03,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=493446.6666666667, ans=0.125 2023-09-29 21:22:05,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 21:22:07,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=493446.6666666667, ans=0.2 2023-09-29 21:22:10,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:10,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:22:12,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:13,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:15,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:22:15,680 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=493513.3333333333, ans=0.2 2023-09-29 21:22:16,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 21:22:16,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 21:22:19,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:22,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:22:22,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-09-29 21:22:23,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:22:23,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:22:25,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:22:26,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:29,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:22:32,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:22:34,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:35,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:36,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 21:22:36,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:22:38,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:22:41,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:22:43,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:22:43,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:22:45,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:45,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:22:46,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:22:48,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:22:49,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:22:49,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:51,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 21:22:54,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:01,140 INFO [train.py:1039] (0/4) Epoch 14, batch 5000, loss[loss=0.1785, simple_loss=0.2612, pruned_loss=0.04793, over 24657.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2614, pruned_loss=0.05843, over 4697765.67 frames. 
], batch size: 68, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:23:01,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 21:23:01,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:23:06,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:06,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:08,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=493713.3333333333, ans=0.2 2023-09-29 21:23:09,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 21:23:09,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 21:23:11,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:23:14,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 21:23:14,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:23:14,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 21:23:16,179 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.874e+02 2.097e+02 2.409e+02 3.545e+02, threshold=4.194e+02, percent-clipped=0.0 2023-09-29 21:23:16,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 21:23:16,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:17,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:19,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 21:23:19,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:19,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:22,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 21:23:22,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 21:23:22,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:23:23,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 21:23:23,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-29 21:23:24,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:25,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:23:25,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 21:23:25,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 21:23:28,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 21:23:28,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:30,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:30,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 21:23:30,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:33,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:35,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:35,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. 
Duration: 0.47775 2023-09-29 21:23:35,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 21:23:35,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:23:36,420 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=15.0 2023-09-29 21:23:38,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:23:40,799 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 21:23:46,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:46,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:46,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:23:51,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 21:23:51,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:51,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:51,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:23:54,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 21:23:54,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:54,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=493913.3333333333, ans=0.2 2023-09-29 21:23:57,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:57,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:23:59,465 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=493913.3333333333, ans=0.125 2023-09-29 21:24:04,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 21:24:09,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:12,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=493980.0, ans=0.04949747468305833 2023-09-29 21:24:19,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:24:20,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:20,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 21:24:21,610 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.93 vs. limit=15.0 2023-09-29 21:24:22,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:22,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:24:22,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:24:22,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:24,189 INFO [train.py:1039] (0/4) Epoch 14, batch 5050, loss[loss=0.199, simple_loss=0.2667, pruned_loss=0.06567, over 23371.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2623, pruned_loss=0.05861, over 4700038.31 frames. ], batch size: 105, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:24:28,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:28,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 21:24:30,931 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.05 vs. limit=22.5 2023-09-29 21:24:31,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:24:34,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:24:34,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:24:36,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 21:24:38,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:38,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:24:40,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:24:42,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:24:42,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:24:51,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 21:24:51,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:24:53,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:24:54,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 21:24:55,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 21:24:56,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:24:58,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:58,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:24:58,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 21:25:00,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 21:25:00,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=494180.0, ans=0.1 2023-09-29 21:25:02,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:03,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:03,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=494180.0, ans=0.125 2023-09-29 21:25:06,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:08,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 21:25:09,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:25:13,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 21:25:13,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:25:13,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=494246.6666666667, ans=0.0 2023-09-29 21:25:14,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:25:14,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:16,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:25:18,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:25:20,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:25:21,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:21,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:25:21,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:25:23,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. 
Duration: 0.87 2023-09-29 21:25:24,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:25:25,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=494246.6666666667, ans=0.09899494936611666 2023-09-29 21:25:26,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:25:31,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:31,401 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 21:25:31,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:25:33,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:25:33,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:33,658 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 21:25:36,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:36,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 21:25:36,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:25:41,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:41,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:42,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 21:25:44,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 21:25:46,464 INFO [train.py:1039] (0/4) Epoch 14, batch 5100, loss[loss=0.1632, simple_loss=0.2408, pruned_loss=0.04275, over 24442.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2632, pruned_loss=0.05904, over 4707581.81 frames. ], batch size: 63, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:25:48,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:25:48,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:25:48,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:25:51,292 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 21:25:54,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:57,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-09-29 21:25:59,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 21:25:59,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:00,834 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.801e+02 1.996e+02 2.340e+02 4.098e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-29 21:26:01,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:26:04,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:26:04,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 21:26:04,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 21:26:11,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:26:11,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:26:13,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=494446.6666666667, ans=0.1 2023-09-29 21:26:15,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 21:26:16,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=494446.6666666667, ans=0.0 2023-09-29 21:26:18,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 21:26:18,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:21,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:26:21,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:26:25,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 21:26:28,574 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 21:26:28,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:28,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 21:26:28,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. 
Duration: 0.97 2023-09-29 21:26:32,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:38,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=494580.0, ans=0.0 2023-09-29 21:26:40,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:26:42,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 21:26:42,529 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 21:26:42,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 21:26:45,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 21:26:45,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:47,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 21:26:51,049 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.83 vs. limit=6.0 2023-09-29 21:26:51,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. 
Duration: 0.71 2023-09-29 21:26:52,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=494646.6666666667, ans=0.025 2023-09-29 21:26:53,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:26:55,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:26:56,248 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.0 2023-09-29 21:26:58,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 21:26:58,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:27:00,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 21:27:03,244 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.96 vs. limit=15.0 2023-09-29 21:27:04,780 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.99 vs. limit=10.0 2023-09-29 21:27:05,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:27:05,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:27:05,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:27:05,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:27:05,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:27:07,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:27:08,859 INFO [train.py:1039] (0/4) Epoch 14, batch 5150, loss[loss=0.1844, simple_loss=0.2804, pruned_loss=0.04423, over 24283.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2646, pruned_loss=0.05979, over 4698142.89 frames. ], batch size: 74, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:27:08,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 21:27:08,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 21:27:10,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 21:27:10,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:27:10,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 21:27:11,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:27:12,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:27:12,958 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:27:15,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:15,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:20,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:27:20,693 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=494713.3333333333, ans=0.125 2023-09-29 21:27:21,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 21:27:21,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:23,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:27:25,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:27:25,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:27:26,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:27:26,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:27:26,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 21:27:29,362 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=494780.0, ans=0.1 2023-09-29 21:27:30,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:27:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:27:32,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:27:34,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 21:27:35,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:27:42,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:27:42,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=494846.6666666667, ans=0.125 2023-09-29 21:27:43,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-09-29 21:27:48,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:53,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:55,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:57,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=494913.3333333333, ans=0.2 2023-09-29 21:28:00,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:00,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:05,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 21:28:10,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:28:11,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:28:11,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:28:12,659 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.66 vs. 
limit=15.0 2023-09-29 21:28:14,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:16,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:18,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 21:28:23,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:28:24,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:28:28,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:28:28,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:28:29,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:28:29,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:28:29,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:28:29,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:28:31,113 INFO [train.py:1039] (0/4) Epoch 14, batch 5200, loss[loss=0.1864, simple_loss=0.2731, pruned_loss=0.04987, over 24651.00 frames. 
], tot_loss[loss=0.1923, simple_loss=0.2645, pruned_loss=0.06002, over 4699664.18 frames. ], batch size: 68, lr: 7.27e-03, grad_scale: 32.0 2023-09-29 21:28:32,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:28:33,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:28:36,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:40,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 21:28:41,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:28:43,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:45,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:45,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:28:45,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:48,046 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.902e+02 2.088e+02 2.453e+02 3.691e+02, threshold=4.175e+02, percent-clipped=0.0 2023-09-29 21:28:48,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. 
Duration: 0.49 2023-09-29 21:28:49,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:28:51,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:54,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 21:28:57,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:28:59,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:29:01,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 21:29:01,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 21:29:04,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 21:29:04,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:04,511 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 21:29:04,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:29:06,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=495180.0, ans=0.125 2023-09-29 21:29:07,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:07,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:29:07,795 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:29:09,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 21:29:10,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:29:12,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:13,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=495180.0, ans=0.0 2023-09-29 21:29:16,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 21:29:16,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 21:29:16,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. 
Duration: 0.82 2023-09-29 21:29:18,151 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=495180.0, ans=0.125 2023-09-29 21:29:21,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 21:29:22,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:29:24,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=495246.6666666667, ans=0.0 2023-09-29 21:29:27,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:29:27,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:29,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 21:29:30,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:30,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:29:30,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:30,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-29 21:29:32,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=495246.6666666667, ans=0.2 2023-09-29 21:29:36,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:39,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:29:43,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:43,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:29:43,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:50,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:52,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 21:29:52,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:52,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:29:53,796 INFO [train.py:1039] (0/4) Epoch 14, batch 5250, loss[loss=0.1637, simple_loss=0.2422, pruned_loss=0.04256, over 24633.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2628, pruned_loss=0.05889, over 4714720.44 frames. 
], batch size: 60, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:29:54,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:54,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:29:57,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:29:59,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:30:00,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:01,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:30:02,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:30:04,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=495380.0, ans=0.0 2023-09-29 21:30:07,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:30:10,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:30:12,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 21:30:15,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:30:17,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 21:30:17,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:17,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:30:20,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=495446.6666666667, ans=0.125 2023-09-29 21:30:22,056 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:30:33,359 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=12.0 2023-09-29 21:31:08,506 INFO [train.py:1039] (0/4) Epoch 14, batch 5300, loss[loss=0.1866, simple_loss=0.272, pruned_loss=0.05058, over 24658.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2631, pruned_loss=0.05881, over 4720232.54 frames. 
], batch size: 68, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:31:20,222 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=495713.3333333333, ans=0.2 2023-09-29 21:31:22,581 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.904e+02 2.089e+02 2.457e+02 4.761e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-29 21:31:25,538 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-14.pt 2023-09-29 21:31:31,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:31:31,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 21:31:31,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 21:31:31,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:31,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:32,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:32,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:32,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:32,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:31:32,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:32,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:31:32,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:31:32,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 21:31:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 21:31:33,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 21:31:33,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:31:33,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 21:31:33,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 21:31:33,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:34,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:34,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:31:34,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:34,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:31:35,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:35,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:35,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:35,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:35,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:35,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:31:35,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:35,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:31:36,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 21:31:36,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:31:37,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:37,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 21:31:37,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 21:31:37,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:31:37,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:31:37,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 21:31:37,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 21:31:37,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:39,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:31:39,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:39,456 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 21:31:39,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. 
Duration: 0.96 2023-09-29 21:31:39,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:31:39,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:39,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 21:31:39,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 21:31:40,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 21:31:40,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:43,614 INFO [train.py:1039] (0/4) Epoch 15, batch 0, loss[loss=0.1949, simple_loss=0.2635, pruned_loss=0.06318, over 23583.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2635, pruned_loss=0.06318, over 23583.00 frames. ], batch size: 256, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:31:43,615 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 21:31:58,972 INFO [train.py:1071] (0/4) Epoch 15, validation: loss=0.2846, simple_loss=0.2783, pruned_loss=0.1455, over 1125622.00 frames. 2023-09-29 21:31:58,973 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-29 21:32:01,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=495800.0, ans=0.0 2023-09-29 21:32:02,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-09-29 21:32:06,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:32:07,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:32:11,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:11,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:32:11,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:12,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 21:32:14,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 21:32:17,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:18,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:22,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:23,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:23,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 21:32:23,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:25,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 21:32:26,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:35,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:32:35,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:39,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 21:32:44,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:32:44,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:32:45,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:50,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:32:53,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:58,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. 
Duration: 0.86 2023-09-29 21:33:00,808 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:33:02,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 21:33:02,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=496000.0, ans=0.0 2023-09-29 21:33:03,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:03,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:04,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:33:05,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:06,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 21:33:11,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:11,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:16,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:33:18,785 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. 
Duration: 0.768 2023-09-29 21:33:21,684 INFO [train.py:1039] (0/4) Epoch 15, batch 50, loss[loss=0.1887, simple_loss=0.2647, pruned_loss=0.05635, over 24456.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2649, pruned_loss=0.0576, over 1071532.35 frames. ], batch size: 63, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:33:21,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:33:24,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:26,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:26,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 21:33:27,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:33:27,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:33:29,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:32,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:32,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=496133.3333333333, ans=0.125 2023-09-29 21:33:33,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:33:37,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 21:33:37,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:39,623 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=496200.0, ans=0.125 2023-09-29 21:33:43,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:33:44,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 21:33:46,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 21:33:48,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:33:49,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:33:49,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:50,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:52,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:33:52,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 21:33:52,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:59,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:00,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:00,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:34:02,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 21:34:04,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:34:05,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:34:05,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 21:34:07,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:10,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 21:34:17,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:34:19,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:34:19,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:21,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:21,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:25,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 21:34:25,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 21:34:25,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=496333.3333333333, ans=0.2 2023-09-29 21:34:28,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:28,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:31,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:34:31,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:31,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=496400.0, ans=0.125 2023-09-29 21:34:32,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. 
Duration: 0.49 2023-09-29 21:34:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 21:34:33,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=496400.0, ans=0.125 2023-09-29 21:34:34,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:34:35,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:35,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:34:37,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 21:34:37,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 21:34:37,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:37,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:39,136 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.074e+02 2.565e+02 3.305e+02 5.603e+02, threshold=5.131e+02, percent-clipped=8.0 2023-09-29 21:34:40,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:34:40,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:34:42,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:34:44,223 INFO [train.py:1039] (0/4) Epoch 15, batch 100, loss[loss=0.193, simple_loss=0.2626, pruned_loss=0.06164, over 23523.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.265, pruned_loss=0.05839, over 1876370.68 frames. ], batch size: 120, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:34:45,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:34:49,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=496466.6666666667, ans=0.125 2023-09-29 21:34:50,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:34:53,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 21:34:53,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:57,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:34:57,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:57,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:57,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 21:34:57,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:59,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 21:35:03,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:35:03,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:03,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:03,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:35:07,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 21:35:10,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:11,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:12,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:35:14,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:35:18,014 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. 
Duration: 0.768 2023-09-29 21:35:19,412 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 21:35:19,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=496600.0, ans=0.1 2023-09-29 21:35:20,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:35:20,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:35:25,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:35:27,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:27,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:33,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:34,015 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 21:35:37,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:35:42,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:35:44,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 21:35:44,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=496666.6666666667, ans=0.0 2023-09-29 21:35:45,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:48,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:52,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:35:52,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:35:55,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:57,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:57,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:57,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:35:58,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:00,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 21:36:00,194 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. 
Duration: 0.896 2023-09-29 21:36:00,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:00,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:36:02,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:02,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:02,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:36:02,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:36:03,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:36:03,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:05,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:06,712 INFO [train.py:1039] (0/4) Epoch 15, batch 150, loss[loss=0.1687, simple_loss=0.2422, pruned_loss=0.04754, over 24605.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2612, pruned_loss=0.05635, over 2506302.90 frames. ], batch size: 60, lr: 7.01e-03, grad_scale: 32.0 2023-09-29 21:36:06,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:36:08,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:36:08,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:36:11,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:13,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:36:13,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:15,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:18,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:18,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:21,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:36:22,310 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-09-29 21:36:23,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:36:29,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 21:36:29,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 21:36:29,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 21:36:29,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=496866.6666666667, ans=0.0 2023-09-29 21:36:32,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:36:32,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:36:32,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:36:34,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:34,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:34,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:34,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:36,096 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-29 21:36:39,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:40,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=496933.3333333333, ans=0.125 2023-09-29 21:36:43,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:46,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:36:48,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 21:36:52,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:36:52,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:52,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:36:54,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:36:54,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:56,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 21:36:57,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:57,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 21:37:01,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:04,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:04,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:37:04,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:37:08,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:09,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 21:37:12,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:37:14,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:37:15,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:19,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 21:37:19,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 21:37:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:37:19,426 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 21:37:23,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:25,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=497066.6666666667, ans=0.1 2023-09-29 21:37:26,184 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.806e+02 2.055e+02 2.590e+02 4.271e+02, threshold=4.110e+02, percent-clipped=0.0 2023-09-29 21:37:26,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:37:26,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:37:29,048 INFO [train.py:1039] (0/4) Epoch 15, batch 200, loss[loss=0.1845, simple_loss=0.2673, pruned_loss=0.05084, over 24651.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2636, pruned_loss=0.05781, over 3003575.82 frames. ], batch size: 68, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:37:29,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 21:37:30,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:37:30,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:34,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 21:37:36,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:37:39,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:39,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:44,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:37:44,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:44,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=497200.0, ans=0.125 2023-09-29 21:37:45,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:52,890 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.09 vs. 
limit=22.5 2023-09-29 21:38:00,041 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=497200.0, ans=0.125 2023-09-29 21:38:08,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:38:08,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=497266.6666666667, ans=0.1 2023-09-29 21:38:10,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:38:10,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:38:10,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:38:12,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 21:38:12,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:38:15,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:17,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:38:18,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:38:20,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:21,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 21:38:21,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:38:21,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:23,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=497333.3333333333, ans=0.125 2023-09-29 21:38:25,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:38:25,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=497333.3333333333, ans=0.125 2023-09-29 21:38:25,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=497333.3333333333, ans=0.125 2023-09-29 21:38:30,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:38,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:39,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 21:38:45,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=497400.0, ans=0.0 2023-09-29 21:38:47,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:48,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 21:38:50,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:50,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:38:50,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:51,884 INFO [train.py:1039] (0/4) Epoch 15, batch 250, loss[loss=0.192, simple_loss=0.2809, pruned_loss=0.05156, over 24648.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2629, pruned_loss=0.05729, over 3380102.90 frames. ], batch size: 73, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:38:51,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:38:53,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 21:38:54,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:38:54,926 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. 
Duration: 0.852 2023-09-29 21:38:56,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:58,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:39:02,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:02,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:39:03,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:39:03,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:05,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:39:09,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:39:20,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:24,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:39:25,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:39:31,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 21:39:31,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:39:33,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:39:35,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:35,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:39:35,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:39:35,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:38,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:39:41,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 21:39:43,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:44,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:39:45,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:39:45,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 21:39:45,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:39:47,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:39:47,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:39:48,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:39:51,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:39:51,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:39:56,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:40:00,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:00,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=497733.3333333333, ans=0.1 2023-09-29 21:40:02,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:40:06,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:40:06,338 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=497733.3333333333, ans=0.07 2023-09-29 21:40:08,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:40:12,384 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.826e+02 2.078e+02 2.374e+02 4.039e+02, threshold=4.156e+02, percent-clipped=0.0 2023-09-29 21:40:12,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 21:40:14,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:40:14,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:40:16,024 INFO [train.py:1039] (0/4) Epoch 15, batch 300, loss[loss=0.1979, simple_loss=0.2762, pruned_loss=0.0598, over 24377.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2606, pruned_loss=0.0565, over 3675114.81 frames. ], batch size: 77, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:40:17,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 21:40:17,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:40:19,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:40:19,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-09-29 21:40:24,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:25,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:40:31,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:40:31,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 21:40:31,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:32,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:40:34,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 21:40:34,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:37,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:40:42,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:40:42,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=497866.6666666667, ans=0.1 2023-09-29 21:40:44,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. 
Duration: 0.88 2023-09-29 21:40:44,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=497866.6666666667, ans=0.125 2023-09-29 21:40:48,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 21:40:48,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:52,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:55,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:55,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 21:40:55,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:40:55,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:40:57,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=497933.3333333333, ans=0.0 2023-09-29 21:40:58,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:40:58,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 21:41:04,086 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.30 vs. limit=10.0 2023-09-29 21:41:05,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:41:05,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 21:41:05,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:41:08,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:10,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 21:41:11,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:15,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:41:17,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:41:17,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 21:41:21,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:21,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 21:41:23,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:25,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:41:26,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 21:41:26,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:41:28,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:29,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 21:41:32,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:32,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:32,885 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=498066.6666666667, ans=0.125 2023-09-29 21:41:34,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:34,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:35,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 21:41:37,674 INFO [train.py:1039] (0/4) Epoch 15, batch 350, loss[loss=0.1938, simple_loss=0.2729, pruned_loss=0.05737, over 16532.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2595, pruned_loss=0.05612, over 3893718.74 frames. ], batch size: 35, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:41:40,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:40,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:41:44,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:49,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:51,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=498133.3333333333, ans=0.125 2023-09-29 21:41:52,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:54,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:57,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 21:41:59,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:59,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. 
Duration: 0.6 2023-09-29 21:42:01,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:02,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 21:42:02,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:06,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 21:42:07,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=498200.0, ans=0.1 2023-09-29 21:42:08,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:42:10,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:10,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:42:12,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:13,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:42:13,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:42:15,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:42:17,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:22,864 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=498266.6666666667, ans=0.125 2023-09-29 21:42:24,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:42:24,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:42:25,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:42:27,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:31,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 21:42:31,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:37,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:37,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:42:37,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:42:40,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 21:42:40,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:40,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=498333.3333333333, ans=0.07 2023-09-29 21:42:41,509 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.07 vs. limit=15.0 2023-09-29 21:42:42,088 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 21:42:43,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 21:42:43,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:45,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:45,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 21:42:49,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:50,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 21:42:51,257 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=498400.0, ans=0.0 2023-09-29 21:42:52,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:54,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:56,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:57,504 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.856e+02 2.198e+02 2.696e+02 4.798e+02, threshold=4.395e+02, percent-clipped=2.0 2023-09-29 21:42:58,086 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=498400.0, ans=0.0 2023-09-29 21:42:59,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:43:00,683 INFO [train.py:1039] (0/4) Epoch 15, batch 400, loss[loss=0.1906, simple_loss=0.2563, pruned_loss=0.06245, over 23581.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2583, pruned_loss=0.05588, over 4070377.00 frames. ], batch size: 120, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:43:00,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:43:02,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. 
Duration: 0.64 2023-09-29 21:43:02,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:03,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:04,096 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=498466.6666666667, ans=0.125 2023-09-29 21:43:05,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:43:07,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:09,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=498466.6666666667, ans=0.2 2023-09-29 21:43:10,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:12,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:13,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 21:43:13,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 21:43:13,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:15,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. 
Duration: 0.97 2023-09-29 21:43:16,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:20,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:43:20,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:20,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 21:43:20,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:43:22,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:22,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:22,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:27,348 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 21:43:27,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 21:43:32,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:33,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 21:43:33,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=498600.0, ans=0.125 2023-09-29 21:43:35,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 21:43:36,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 21:43:39,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:43:42,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:43:43,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=498600.0, ans=0.0 2023-09-29 21:43:45,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=498600.0, ans=0.0 2023-09-29 21:43:48,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 21:43:51,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=498666.6666666667, ans=0.05 2023-09-29 21:43:52,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:43:54,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 21:43:58,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:44:01,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:44:02,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 21:44:04,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:44:07,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:44:08,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:44:13,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:13,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 21:44:15,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:44:16,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 21:44:18,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:44:18,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:44:20,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. 
Duration: 0.89 2023-09-29 21:44:21,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:44:23,338 INFO [train.py:1039] (0/4) Epoch 15, batch 450, loss[loss=0.1896, simple_loss=0.2768, pruned_loss=0.05118, over 24395.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2596, pruned_loss=0.05678, over 4208289.85 frames. ], batch size: 69, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:44:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:44:23,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:44:25,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 21:44:25,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:44:26,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:44:28,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:44:28,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 21:44:30,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:44:32,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 21:44:33,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:44:43,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:45,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:44:45,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 21:44:46,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 21:44:53,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:44:53,726 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.22 vs. limit=12.0 2023-09-29 21:44:54,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:54,983 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:44:57,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:00,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:00,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:45:02,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=498933.3333333333, ans=0.125 2023-09-29 21:45:03,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 21:45:05,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 21:45:07,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 21:45:07,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:09,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:09,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:45:11,135 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 21:45:11,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 21:45:11,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:45:13,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:45:13,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 21:45:18,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:45:18,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:45:19,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:45:19,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 21:45:22,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:24,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:45:24,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:45:26,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 21:45:26,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=499000.0, ans=0.2 2023-09-29 21:45:29,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:45:31,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 21:45:31,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. 
Duration: 0.61 2023-09-29 21:45:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:39,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:45:40,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:45:42,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:45:42,975 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 21:45:44,926 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.892e+02 2.180e+02 2.454e+02 3.588e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 21:45:46,457 INFO [train.py:1039] (0/4) Epoch 15, batch 500, loss[loss=0.2041, simple_loss=0.2692, pruned_loss=0.06955, over 22791.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2608, pruned_loss=0.0572, over 4320710.25 frames. ], batch size: 322, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:45:48,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:49,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:45:51,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:51,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. 
Duration: 0.896 2023-09-29 21:45:52,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 21:45:52,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:54,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:45:59,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:46:01,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:46:02,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:46:02,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:46:04,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:08,848 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:46:16,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:18,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:46:18,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. 
Duration: 0.72225 2023-09-29 21:46:20,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:20,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 21:46:20,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:46:24,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:46:24,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=499266.6666666667, ans=0.0 2023-09-29 21:46:25,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:46:25,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:46:25,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:26,323 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.34 vs. limit=15.0 2023-09-29 21:46:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 21:46:30,134 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 21:46:31,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:46:33,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:37,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:46:38,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 21:46:41,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:46:41,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:46:42,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=499333.3333333333, ans=0.0 2023-09-29 21:46:46,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:46:50,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:56,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:01,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-09-29 21:47:01,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:01,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:04,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 21:47:04,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:47:06,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:07,616 INFO [train.py:1039] (0/4) Epoch 15, batch 550, loss[loss=0.194, simple_loss=0.2627, pruned_loss=0.06268, over 23289.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2628, pruned_loss=0.05899, over 4399309.01 frames. ], batch size: 119, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:47:09,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 21:47:10,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 21:47:12,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:13,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 21:47:14,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:47:14,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:16,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:47:17,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:47:19,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:20,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=499466.6666666667, ans=0.1 2023-09-29 21:47:22,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 21:47:22,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:47:28,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:30,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:47:33,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:47:33,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:34,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=499533.3333333333, ans=0.1 2023-09-29 21:47:37,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 21:47:39,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 21:47:41,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:47:43,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=499600.0, ans=0.125 2023-09-29 21:47:44,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:47:44,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:46,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:47:46,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=499600.0, ans=0.0 2023-09-29 21:47:51,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 21:47:51,023 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 21:47:52,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:52,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:47:55,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:57,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:47:57,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:47:59,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:59,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 21:48:02,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 21:48:04,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:04,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:48:04,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=499666.6666666667, ans=0.0 2023-09-29 21:48:06,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:06,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:48:06,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=499666.6666666667, ans=0.125 2023-09-29 21:48:08,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:48:08,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=499666.6666666667, ans=0.0 2023-09-29 21:48:09,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:48:12,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:48:13,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:14,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:48:16,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 21:48:18,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:19,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:48:19,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:21,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:48:21,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:48:27,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 21:48:28,880 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.896e+02 2.048e+02 2.393e+02 3.212e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-29 21:48:30,438 INFO [train.py:1039] (0/4) Epoch 15, batch 600, loss[loss=0.1656, simple_loss=0.2383, pruned_loss=0.04648, over 24451.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2627, pruned_loss=0.05847, over 4481871.34 frames. ], batch size: 58, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:48:31,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 21:48:33,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:48:33,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 21:48:34,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:36,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=499800.0, ans=0.0 2023-09-29 21:48:41,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:48:41,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:48:42,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 21:48:43,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=499800.0, ans=0.1 2023-09-29 21:48:45,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:48:47,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:48:49,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:51,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=499866.6666666667, ans=0.125 2023-09-29 21:48:52,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 21:48:52,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:48:58,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 21:49:01,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:49:01,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:03,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:49:04,950 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.36 vs. limit=15.0 2023-09-29 21:49:09,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:49:09,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:49:09,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:17,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:49:17,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=499933.3333333333, ans=0.1 2023-09-29 21:49:22,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 21:49:22,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:49:22,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:24,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=500000.0, ans=0.0 2023-09-29 21:49:30,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 21:49:32,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=500000.0, ans=0.5 2023-09-29 21:49:38,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:49:38,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:49:43,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 21:49:45,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:49:49,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 21:49:49,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:49:50,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 21:49:51,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=500066.6666666667, ans=0.125 2023-09-29 21:49:53,637 INFO [train.py:1039] (0/4) Epoch 15, batch 650, loss[loss=0.1945, simple_loss=0.2812, pruned_loss=0.05391, over 24655.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2622, pruned_loss=0.05859, over 4534382.29 frames. ], batch size: 73, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:49:53,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:49:54,733 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-09-29 21:49:55,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:49:57,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:49:58,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=500133.3333333333, ans=0.0 2023-09-29 21:49:59,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:50:00,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:04,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. 
Duration: 0.72 2023-09-29 21:50:04,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=500133.3333333333, ans=0.2 2023-09-29 21:50:05,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:50:10,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:50:10,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:14,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:19,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 21:50:20,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:20,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:25,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:50:25,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 21:50:28,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:30,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:50:32,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:50:32,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:33,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:50:35,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:50:35,506 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 21:50:35,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:35,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:39,183 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.12 vs. limit=15.0 2023-09-29 21:50:40,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:40,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:41,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:50:43,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 21:50:44,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 21:50:44,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:50:44,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:50:46,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:50:46,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:47,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:50:50,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 21:50:52,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 21:50:52,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:52,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:52,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:50:53,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:50:55,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:51:00,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:00,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:00,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=500400.0, ans=0.2 2023-09-29 21:51:01,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:51:05,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:05,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 21:51:05,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:07,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=500400.0, ans=0.0 2023-09-29 21:51:12,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:51:12,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:14,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 21:51:14,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:15,839 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.967e+02 2.230e+02 2.701e+02 4.378e+02, threshold=4.460e+02, percent-clipped=5.0 2023-09-29 21:51:15,881 INFO [train.py:1039] (0/4) Epoch 15, batch 700, loss[loss=0.1804, simple_loss=0.2692, pruned_loss=0.04581, over 24445.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2609, pruned_loss=0.05792, over 4565692.70 frames. ], batch size: 69, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:51:20,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 21:51:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 21:51:22,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 21:51:25,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:28,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:51:29,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 21:51:32,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:35,074 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.61 vs. 
limit=15.0 2023-09-29 21:51:36,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:51:36,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:37,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=500533.3333333333, ans=0.125 2023-09-29 21:51:39,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:51:39,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:42,239 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.39 vs. limit=22.5 2023-09-29 21:51:42,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:45,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 21:51:45,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:51:47,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 21:51:49,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 21:51:53,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. 
Duration: 0.7 2023-09-29 21:51:53,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:51:55,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:52:03,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:52:03,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 21:52:06,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=500666.6666666667, ans=0.125 2023-09-29 21:52:07,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:09,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:52:09,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 21:52:14,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:52:15,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:18,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:52:25,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 21:52:25,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 21:52:25,486 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=500733.3333333333, ans=0.2 2023-09-29 21:52:26,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 21:52:27,565 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.38 vs. limit=15.0 2023-09-29 21:52:28,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 21:52:28,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=500733.3333333333, ans=0.0 2023-09-29 21:52:30,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:32,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:32,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:52:35,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:35,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 21:52:39,403 INFO [train.py:1039] (0/4) Epoch 15, batch 750, loss[loss=0.1999, simple_loss=0.2639, pruned_loss=0.06796, over 23935.00 frames. 
], tot_loss[loss=0.1876, simple_loss=0.2605, pruned_loss=0.05733, over 4606975.76 frames. ], batch size: 195, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:52:41,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 21:52:41,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 21:52:41,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 21:52:41,781 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.31 vs. limit=22.5 2023-09-29 21:52:42,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 21:52:42,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 21:52:44,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:52:46,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 21:52:46,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:47,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:52:48,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:52:49,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:50,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:52:51,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:52,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=500800.0, ans=0.0 2023-09-29 21:52:54,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:52:54,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:52:57,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:52:57,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=500866.6666666667, ans=0.5 2023-09-29 21:52:58,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:59,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 21:52:59,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=500866.6666666667, ans=0.125 2023-09-29 21:53:00,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 21:53:01,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:53:03,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:53:06,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 21:53:07,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:08,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=500866.6666666667, ans=0.0 2023-09-29 21:53:10,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 21:53:10,102 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 21:53:11,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-09-29 21:53:11,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:53:11,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:53:14,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:53:21,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:53:22,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:22,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:53:24,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:53:26,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:53:26,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 21:53:27,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:53:29,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:53:29,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 21:53:32,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:53:32,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 21:53:34,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:39,771 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=501000.0, ans=0.0 2023-09-29 21:53:41,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:53:41,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:53:42,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:44,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:53:49,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 21:53:49,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:53:49,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:54,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 21:53:54,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:54,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=501066.6666666667, ans=0.125 2023-09-29 21:53:57,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:57,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:54:02,040 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.875e+02 2.035e+02 2.283e+02 3.726e+02, threshold=4.071e+02, percent-clipped=0.0 2023-09-29 21:54:02,084 INFO [train.py:1039] (0/4) Epoch 15, batch 800, loss[loss=0.1878, simple_loss=0.2766, pruned_loss=0.04949, over 24434.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2614, pruned_loss=0.05749, over 4634217.64 frames. ], batch size: 69, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:54:03,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:03,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:04,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=501133.3333333333, ans=0.125 2023-09-29 21:54:06,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=501133.3333333333, ans=0.0 2023-09-29 21:54:07,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 21:54:07,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:09,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:09,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:11,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:15,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:15,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:54:18,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 21:54:20,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:21,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:21,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:54:22,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:23,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. 
Duration: 0.85 2023-09-29 21:54:23,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:25,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 21:54:28,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:31,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=501200.0, ans=0.125 2023-09-29 21:54:32,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:35,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:54:35,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:38,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:42,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:54:43,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:54:43,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. 
Duration: 0.411125 2023-09-29 21:54:47,191 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 21:54:47,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 21:54:47,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:54:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:48,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:48,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:54:55,131 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 21:54:55,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 21:54:58,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:54:59,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:55:04,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 21:55:08,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=501400.0, ans=0.0 2023-09-29 21:55:09,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:10,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=501400.0, ans=0.07 2023-09-29 21:55:11,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 21:55:11,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:55:14,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 21:55:16,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=501400.0, ans=0.2 2023-09-29 21:55:21,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:24,935 INFO [train.py:1039] (0/4) Epoch 15, batch 850, loss[loss=0.1926, simple_loss=0.2616, pruned_loss=0.06178, over 23844.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.262, pruned_loss=0.05762, over 4664544.45 frames. ], batch size: 195, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:55:25,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:55:25,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-09-29 21:55:25,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:55:26,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:28,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 21:55:29,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:31,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:55:32,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:34,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:55:35,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:55:36,691 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.35 vs. limit=22.5 2023-09-29 21:55:37,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 21:55:38,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 21:55:39,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. 
Duration: 0.76 2023-09-29 21:55:40,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:41,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:55:42,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:42,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:42,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=501533.3333333333, ans=0.0 2023-09-29 21:55:44,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:55:50,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:50,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:51,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 21:55:55,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 21:55:58,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:56:01,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. 
Duration: 0.88 2023-09-29 21:56:05,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 21:56:07,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 21:56:08,854 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 21:56:08,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:08,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:56:08,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:56:11,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:12,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:13,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 21:56:16,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:17,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:18,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 21:56:18,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:56:20,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:56:22,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:56:24,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 21:56:27,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:56:28,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:28,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:56:28,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:30,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:32,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=501733.3333333333, ans=0.125 2023-09-29 21:56:34,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:56:36,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:56:37,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:56:39,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:56:39,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:56:42,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=501733.3333333333, ans=0.0 2023-09-29 21:56:44,526 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.02 vs. limit=6.0 2023-09-29 21:56:46,689 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.806e+02 2.003e+02 2.266e+02 2.717e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-29 21:56:46,734 INFO [train.py:1039] (0/4) Epoch 15, batch 900, loss[loss=0.1844, simple_loss=0.2621, pruned_loss=0.05332, over 24620.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2632, pruned_loss=0.0579, over 4681998.46 frames. ], batch size: 65, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:56:48,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:56:50,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:50,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. 
Duration: 0.87 2023-09-29 21:56:51,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:56:51,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:53,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 21:56:55,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=501800.0, ans=0.0 2023-09-29 21:57:00,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:57:03,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:03,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 21:57:07,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:57:07,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 21:57:09,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:57:09,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:57:09,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 21:57:10,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:57:10,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:57:22,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:22,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:22,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:57:25,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:28,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 21:57:31,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:57:33,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=501933.3333333333, ans=0.125 2023-09-29 21:57:36,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:57:36,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:57:38,377 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. 
Duration: 0.958 2023-09-29 21:57:38,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 21:57:45,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:57:46,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:57:46,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:57:52,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:53,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:57:54,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 21:57:54,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:56,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 21:57:58,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:57:58,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:01,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 21:58:01,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:06,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 21:58:06,854 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 21:58:08,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:58:08,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 21:58:09,934 INFO [train.py:1039] (0/4) Epoch 15, batch 950, loss[loss=0.1823, simple_loss=0.2431, pruned_loss=0.0607, over 23527.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.264, pruned_loss=0.05855, over 4673231.76 frames. ], batch size: 256, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:58:12,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:15,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 21:58:21,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:23,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:23,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 21:58:24,261 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.28 vs. limit=15.0 2023-09-29 21:58:25,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:58:28,136 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 21:58:30,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:30,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:30,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:58:31,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 21:58:33,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:58:35,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:36,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 21:58:36,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 21:58:42,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=502266.6666666667, ans=0.0 2023-09-29 21:58:43,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:43,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:43,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:45,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 21:58:49,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:58:50,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:52,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:58:58,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:58,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:59:01,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 21:59:02,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-29 21:59:02,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:59:05,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:05,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:05,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:59:09,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 21:59:12,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:59:16,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:16,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:16,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 21:59:16,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:16,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:59:18,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. 
Duration: 0.9 2023-09-29 21:59:23,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:59:24,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:28,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:29,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 21:59:29,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 21:59:32,763 INFO [train.py:1039] (0/4) Epoch 15, batch 1000, loss[loss=0.1758, simple_loss=0.2476, pruned_loss=0.05201, over 24596.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2632, pruned_loss=0.05851, over 4683600.81 frames. ], batch size: 60, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 21:59:34,242 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.874e+02 2.213e+02 2.619e+02 3.676e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 21:59:34,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:37,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 21:59:38,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:59:45,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 21:59:47,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 21:59:47,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 21:59:49,736 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=502533.3333333333, ans=0.125 2023-09-29 21:59:50,361 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.43 vs. limit=6.0 2023-09-29 21:59:53,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:59:53,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:54,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:58,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 22:00:00,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 22:00:01,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 22:00:01,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:04,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. 
Duration: 0.97 2023-09-29 22:00:06,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:00:06,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 22:00:06,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=502600.0, ans=0.2 2023-09-29 22:00:07,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:08,528 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.82 vs. limit=6.0 2023-09-29 22:00:09,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:16,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=502600.0, ans=0.125 2023-09-29 22:00:17,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:17,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:00:17,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:20,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:20,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. 
Duration: 0.8 2023-09-29 22:00:20,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:20,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:00:22,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:22,212 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 22:00:25,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 22:00:27,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 22:00:30,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 22:00:32,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:00:35,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=502666.6666666667, ans=0.125 2023-09-29 22:00:39,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:39,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:00:39,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 22:00:42,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:00:42,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 22:00:44,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:00:45,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 22:00:45,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 22:00:47,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:00:47,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:51,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:00:51,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=502733.3333333333, ans=0.0 2023-09-29 22:00:54,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:00:56,320 INFO [train.py:1039] (0/4) Epoch 15, batch 1050, loss[loss=0.1894, simple_loss=0.2532, pruned_loss=0.06281, over 23314.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2611, pruned_loss=0.05758, over 4683150.13 frames. 
], batch size: 105, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:00:56,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:00,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:01:00,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=502800.0, ans=0.0 2023-09-29 22:01:01,015 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.57 vs. limit=15.0 2023-09-29 22:01:01,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:01:03,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:01:04,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:07,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:10,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:01:12,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:01:14,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:01:15,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:01:15,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:01:17,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:01:17,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 22:01:18,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:18,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 22:01:20,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:01:20,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 22:01:20,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:01:22,509 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=502866.6666666667, ans=0.0 2023-09-29 22:01:27,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:29,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-09-29 22:01:29,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:32,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 22:01:34,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 22:01:34,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:36,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 22:01:38,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 22:01:39,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:44,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:01:47,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:01:47,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:01:48,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:01:51,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 22:01:55,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 22:01:56,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 22:01:56,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 22:01:56,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:01:56,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:02:00,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 22:02:05,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:02:07,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:02:07,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:08,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:08,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:13,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 22:02:13,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 22:02:14,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:15,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 22:02:15,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 22:02:16,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:02:17,880 INFO [train.py:1039] (0/4) Epoch 15, batch 1100, loss[loss=0.1958, simple_loss=0.2658, pruned_loss=0.06292, over 23653.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2607, pruned_loss=0.05694, over 4700813.63 frames. ], batch size: 256, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:02:19,328 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.797e+02 2.092e+02 2.502e+02 4.130e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 22:02:19,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:02:25,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:02:30,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:02:31,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 22:02:31,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:33,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 22:02:35,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:02:37,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:02:39,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:02:44,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:02:44,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 22:02:47,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:02:47,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:47,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:50,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:02:52,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 22:02:52,964 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=503266.6666666667, ans=0.2 2023-09-29 22:02:57,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:02:57,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=503266.6666666667, ans=0.125 2023-09-29 22:02:57,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=503266.6666666667, ans=0.125 2023-09-29 22:02:59,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=503266.6666666667, ans=0.1 2023-09-29 22:03:00,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 22:03:01,743 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 22:03:01,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:02,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=503266.6666666667, ans=0.2 2023-09-29 22:03:04,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:05,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:03:06,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:03:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 22:03:08,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:03:10,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:03:10,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:03:10,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:10,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 22:03:16,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:03:16,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 22:03:20,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:03:24,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:03:26,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 22:03:26,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 22:03:28,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:31,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=503400.0, ans=0.0 2023-09-29 22:03:32,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:32,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:33,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 22:03:34,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:03:35,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:37,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 22:03:37,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:03:37,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 22:03:38,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:03:38,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 22:03:40,053 INFO [train.py:1039] (0/4) Epoch 15, batch 1150, loss[loss=0.1965, simple_loss=0.2793, pruned_loss=0.05692, over 24307.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.262, pruned_loss=0.05728, over 4709426.05 frames. ], batch size: 74, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:03:40,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:03:43,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:48,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:03:50,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:50,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:03:51,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 22:03:52,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:56,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 22:03:57,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:57,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 22:04:05,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 22:04:06,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:09,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:04:10,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:10,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 22:04:10,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:04:10,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:04:13,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 22:04:14,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:15,570 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.29 vs. limit=15.0 2023-09-29 22:04:16,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:04:26,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=503600.0, ans=0.04949747468305833 2023-09-29 22:04:29,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 22:04:36,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:36,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:43,221 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 22:04:44,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:53,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 22:04:59,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:04:59,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:05:01,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. 
Duration: 0.5 2023-09-29 22:05:01,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:05:03,926 INFO [train.py:1039] (0/4) Epoch 15, batch 1200, loss[loss=0.2075, simple_loss=0.2669, pruned_loss=0.07398, over 23818.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2626, pruned_loss=0.05781, over 4710056.16 frames. ], batch size: 164, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:05:04,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=503800.0, ans=0.05 2023-09-29 22:05:05,383 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.402e+02 1.796e+02 2.044e+02 2.374e+02 3.909e+02, threshold=4.087e+02, percent-clipped=0.0 2023-09-29 22:05:05,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:11,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:05:11,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:05:14,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:14,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:14,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:05:16,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:05:18,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:05:19,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:21,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:22,702 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 22:05:23,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=503866.6666666667, ans=0.0 2023-09-29 22:05:24,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 22:05:27,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:05:30,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:05:33,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:36,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:05:36,411 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 22:05:38,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:05:44,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:05:44,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:05:46,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 22:05:46,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:05:49,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 22:05:55,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 22:05:55,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:57,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:58,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:00,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:06:01,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:06:01,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 22:06:03,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:06:03,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 22:06:05,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:06:05,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:05,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:06:06,223 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.94 vs. limit=22.5 2023-09-29 22:06:07,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:07,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:13,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:06:14,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:06:17,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 22:06:21,956 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. 
Duration: 0.9935 2023-09-29 22:06:23,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:24,971 INFO [train.py:1039] (0/4) Epoch 15, batch 1250, loss[loss=0.1788, simple_loss=0.2573, pruned_loss=0.05011, over 24644.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.263, pruned_loss=0.05776, over 4719165.19 frames. ], batch size: 65, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:06:26,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:29,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:06:29,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:34,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 22:06:37,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:06:39,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:40,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 22:06:41,904 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.72 vs. limit=15.0 2023-09-29 22:06:42,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 22:06:42,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:06:45,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=504200.0, ans=0.125 2023-09-29 22:06:48,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:06:49,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:51,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:06:51,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:52,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:06:56,287 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=504266.6666666667, ans=0.125 2023-09-29 22:06:57,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:06:57,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:06:57,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 22:06:59,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:59,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:02,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:02,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=504266.6666666667, ans=0.0 2023-09-29 22:07:03,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:07:08,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 22:07:08,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:07:11,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:11,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 22:07:13,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:07:13,235 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 22:07:13,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 22:07:13,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:19,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:22,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:23,001 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.21 vs. limit=12.0 2023-09-29 22:07:23,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:07:25,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 22:07:25,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 22:07:25,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 22:07:28,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:30,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 22:07:30,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:33,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-09-29 22:07:33,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:07:34,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 22:07:34,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:07:36,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:07:36,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:07:37,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:37,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 22:07:40,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:42,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:07:42,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:07:46,009 INFO [train.py:1039] (0/4) Epoch 15, batch 1300, loss[loss=0.2487, simple_loss=0.3083, pruned_loss=0.09452, over 19646.00 frames. ], tot_loss[loss=0.1905, simple_loss=0.2638, pruned_loss=0.0586, over 4703126.37 frames. 
], batch size: 389, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:07:46,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:07:48,103 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.942e+02 2.297e+02 2.853e+02 4.160e+02, threshold=4.593e+02, percent-clipped=1.0 2023-09-29 22:07:50,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:50,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 22:07:55,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:58,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:07:59,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:00,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:08:00,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:08:00,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=504466.6666666667, ans=0.2 2023-09-29 22:08:01,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. 
Duration: 0.81 2023-09-29 22:08:07,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:08:09,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:08:10,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 22:08:12,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:08:17,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:19,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:20,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:08:22,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:22,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:08:23,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:08:23,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-09-29 22:08:27,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=504600.0, ans=0.0 2023-09-29 22:08:30,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:08:30,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:08:31,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 22:08:32,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:08:32,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=504600.0, ans=0.125 2023-09-29 22:08:33,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:08:35,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=504666.6666666667, ans=0.125 2023-09-29 22:08:38,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:38,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 22:08:38,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:38,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-09-29 22:08:39,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:42,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:42,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:08:47,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 22:08:49,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 22:08:51,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 22:08:56,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:08:58,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 22:09:01,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:07,785 INFO [train.py:1039] (0/4) Epoch 15, batch 1350, loss[loss=0.2018, simple_loss=0.2694, pruned_loss=0.06715, over 23465.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2624, pruned_loss=0.05871, over 4701609.91 frames. ], batch size: 105, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:09:07,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. 
Duration: 0.91 2023-09-29 22:09:09,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:12,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:15,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:17,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:18,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:09:18,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:22,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=504866.6666666667, ans=0.1 2023-09-29 22:09:25,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:26,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 22:09:27,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:09:29,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:09:33,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. 
Duration: 0.39 2023-09-29 22:09:33,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:09:35,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:09:35,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 22:09:36,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 22:09:38,457 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=504866.6666666667, ans=0.125 2023-09-29 22:09:39,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 22:09:41,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:41,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 22:09:41,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=504933.3333333333, ans=0.0 2023-09-29 22:09:53,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 22:10:03,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 22:10:08,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:10,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 22:10:10,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:10:11,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:10:14,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:10:16,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 22:10:17,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:10:22,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 22:10:24,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 22:10:29,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 22:10:29,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 22:10:31,216 INFO [train.py:1039] (0/4) Epoch 15, batch 1400, loss[loss=0.1761, simple_loss=0.2446, pruned_loss=0.05375, over 23613.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2605, pruned_loss=0.05802, over 4709859.72 frames. ], batch size: 135, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:10:33,171 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.906e+02 2.114e+02 2.329e+02 4.269e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-29 22:10:34,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:10:34,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:10:43,035 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.87 vs. limit=15.0 2023-09-29 22:10:43,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 22:10:45,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 22:10:53,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:10:55,407 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=505200.0, ans=0.0 2023-09-29 22:10:56,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:10:59,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 22:10:59,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:11:02,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:11:02,932 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=505266.6666666667, ans=0.1 2023-09-29 22:11:05,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 22:11:14,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:15,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:20,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 22:11:22,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:11:23,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:11:25,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:11:25,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:11:26,757 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:11:27,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:11:27,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:11:28,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:11:28,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 22:11:29,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:11:34,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:36,718 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.95 vs. limit=15.0 2023-09-29 22:11:37,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:11:43,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 22:11:44,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:11:44,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:11:50,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 22:11:50,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=505400.0, ans=0.5 2023-09-29 22:11:51,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:11:53,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:11:54,999 INFO [train.py:1039] (0/4) Epoch 15, batch 1450, loss[loss=0.1993, simple_loss=0.2679, pruned_loss=0.06528, over 23815.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2594, pruned_loss=0.05728, over 4703502.26 frames. ], batch size: 164, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:11:56,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:11:58,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=505466.6666666667, ans=0.125 2023-09-29 22:12:01,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:12:01,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:01,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:12:05,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:12:07,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:12:08,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:12:08,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 22:12:10,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:12:11,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 22:12:12,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:13,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:13,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 22:12:13,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:15,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:12:15,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=505533.3333333333, ans=0.125 2023-09-29 22:12:16,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-29 22:12:16,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:18,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:12:20,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:20,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=505533.3333333333, ans=0.1 2023-09-29 22:12:23,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:26,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:12:27,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:12:29,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:29,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:30,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:30,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 22:12:31,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:32,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:33,366 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.94 vs. limit=15.0 2023-09-29 22:12:36,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 22:12:37,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=505600.0, ans=0.1 2023-09-29 22:12:39,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:43,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=505666.6666666667, ans=0.125 2023-09-29 22:12:44,385 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 22:12:45,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:12:46,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:12:47,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:12:49,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. 
Duration: 0.98 2023-09-29 22:12:52,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:56,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 22:12:59,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 22:13:00,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:03,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:05,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:05,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=505733.3333333333, ans=0.125 2023-09-29 22:13:06,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 22:13:08,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 22:13:08,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 22:13:09,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:11,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 22:13:11,822 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=505733.3333333333, ans=0.1 2023-09-29 22:13:16,240 INFO [train.py:1039] (0/4) Epoch 15, batch 1500, loss[loss=0.188, simple_loss=0.26, pruned_loss=0.05796, over 23638.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2601, pruned_loss=0.05725, over 4713298.51 frames. ], batch size: 232, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:13:17,606 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.853e+02 2.114e+02 2.421e+02 4.526e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-29 22:13:21,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 22:13:21,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=505800.0, ans=0.2 2023-09-29 22:13:22,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:13:22,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:13:22,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:24,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:26,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:13:27,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. 
Duration: 0.65 2023-09-29 22:13:29,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:13:29,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:13:29,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:31,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:32,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:13:34,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 22:13:39,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:13:40,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:13:40,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:43,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. 
Duration: 0.87 2023-09-29 22:13:45,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=505866.6666666667, ans=0.0 2023-09-29 22:13:48,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 22:13:49,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:51,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 22:13:54,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:13:57,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:13:57,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:57,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:58,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 22:13:58,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:13:58,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:00,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. 
Duration: 0.87 2023-09-29 22:14:02,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:06,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:14:06,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 22:14:09,972 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=506000.0, ans=0.1 2023-09-29 22:14:10,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=506000.0, ans=0.125 2023-09-29 22:14:12,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:14:14,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:14:20,419 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 22:14:20,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:20,492 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 22:14:20,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:14:20,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=506066.6666666667, ans=0.125 2023-09-29 22:14:22,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:14:23,650 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 22:14:25,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:14:29,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 22:14:31,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:14:36,447 INFO [train.py:1039] (0/4) Epoch 15, batch 1550, loss[loss=0.195, simple_loss=0.2681, pruned_loss=0.06099, over 23646.00 frames. 
], tot_loss[loss=0.1876, simple_loss=0.2609, pruned_loss=0.0572, over 4732659.86 frames. ], batch size: 149, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:14:36,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 22:14:38,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 22:14:38,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:14:40,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 22:14:40,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 22:14:43,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:44,454 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.55 vs. limit=6.0 2023-09-29 22:14:45,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:46,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:14:46,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:14:48,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:14:48,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:49,972 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=506133.3333333333, ans=0.1 2023-09-29 22:14:52,860 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 22:14:54,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:54,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:14:54,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:14:57,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:14:57,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 22:14:58,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:59,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 22:15:01,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 22:15:01,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. 
Duration: 0.98 2023-09-29 22:15:02,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:03,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:08,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:15:11,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 22:15:11,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 22:15:13,932 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=506266.6666666667, ans=0.125 2023-09-29 22:15:19,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:22,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:15:24,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:15:24,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:15:25,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 22:15:30,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-29 22:15:31,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:32,078 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=506333.3333333333, ans=0.125 2023-09-29 22:15:32,689 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-09-29 22:15:34,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:15:35,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=506333.3333333333, ans=0.125 2023-09-29 22:15:37,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:15:39,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:39,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 22:15:39,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:15:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:15:40,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:15:42,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:15:42,427 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 22:15:47,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:15:47,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=506400.0, ans=0.95 2023-09-29 22:15:52,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 22:15:57,914 INFO [train.py:1039] (0/4) Epoch 15, batch 1600, loss[loss=0.1746, simple_loss=0.2573, pruned_loss=0.04591, over 24492.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2617, pruned_loss=0.05712, over 4733951.15 frames. ], batch size: 66, lr: 6.95e-03, grad_scale: 32.0 2023-09-29 22:15:59,414 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.908e+02 2.211e+02 2.589e+02 3.896e+02, threshold=4.422e+02, percent-clipped=0.0 2023-09-29 22:15:59,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:01,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:01,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 22:16:02,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-29 22:16:02,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:02,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:16:02,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:16:04,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:16:07,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:09,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 22:16:10,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 22:16:12,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 22:16:15,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:15,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 22:16:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:16:19,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:16:23,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:16:28,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 22:16:31,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:16:33,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 22:16:34,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:34,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 22:16:39,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 22:16:42,944 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:16:45,970 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-76000.pt 2023-09-29 22:16:49,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:49,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 22:16:50,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:16:50,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:50,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:16:52,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 22:16:58,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:17:00,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:17:00,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:00,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:01,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:17:05,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:17:06,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:17:07,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 22:17:10,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=506733.3333333333, ans=0.95 2023-09-29 22:17:10,666 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.68 vs. limit=15.0 2023-09-29 22:17:14,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:14,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:17:17,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 22:17:17,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:17:17,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 22:17:23,494 INFO [train.py:1039] (0/4) Epoch 15, batch 1650, loss[loss=0.1667, simple_loss=0.2469, pruned_loss=0.0432, over 24516.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.262, pruned_loss=0.05739, over 4732810.47 frames. ], batch size: 63, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:17:25,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:17:27,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 22:17:27,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 22:17:27,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 22:17:27,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 22:17:28,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 22:17:32,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:32,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:34,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:17:34,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:17:37,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:40,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 22:17:43,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=506866.6666666667, ans=0.0 2023-09-29 22:17:44,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 22:17:44,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:44,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:17:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:17:45,073 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.72 vs. limit=22.5 2023-09-29 22:17:45,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 22:17:45,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 22:17:53,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:17:55,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:18:03,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 22:18:03,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:07,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. 
Duration: 0.55 2023-09-29 22:18:09,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=506933.3333333333, ans=0.125 2023-09-29 22:18:10,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:12,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:18:13,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:18:14,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:14,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:18:14,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:15,451 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.26 vs. limit=10.0 2023-09-29 22:18:17,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:19,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:19,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 22:18:19,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:20,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:21,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:18:24,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:26,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 22:18:27,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:27,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 22:18:28,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 22:18:28,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 22:18:28,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:30,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:18:30,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:18:31,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 22:18:35,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:38,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:18:38,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:39,300 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.43 vs. limit=6.0 2023-09-29 22:18:41,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 22:18:41,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=507066.6666666667, ans=0.09899494936611666 2023-09-29 22:18:44,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=507066.6666666667, ans=0.125 2023-09-29 22:18:47,120 INFO [train.py:1039] (0/4) Epoch 15, batch 1700, loss[loss=0.1789, simple_loss=0.2596, pruned_loss=0.04906, over 24473.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2613, pruned_loss=0.05793, over 4710217.72 frames. 
], batch size: 63, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:18:48,609 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.972e+02 2.169e+02 2.497e+02 4.927e+02, threshold=4.339e+02, percent-clipped=2.0 2023-09-29 22:18:48,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:48,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:18:48,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 22:18:50,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:18:50,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:18:51,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:54,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:18:54,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:18:54,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 22:18:57,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 22:19:05,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:07,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:19:14,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:19:14,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:14,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:19:14,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:16,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=507200.0, ans=0.125 2023-09-29 22:19:18,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=507266.6666666667, ans=0.1 2023-09-29 22:19:19,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 22:19:21,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:19:21,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:19:22,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:19:24,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:19:26,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 22:19:26,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 22:19:27,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:29,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 22:19:30,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:19:30,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=507266.6666666667, ans=0.125 2023-09-29 22:19:38,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:38,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:40,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:41,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. 
Duration: 0.663625 2023-09-29 22:19:41,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 22:19:41,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:45,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:45,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 22:19:45,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:19:45,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:47,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:47,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:19:50,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:50,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:19:51,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:51,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 22:19:51,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:56,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:57,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 22:19:59,702 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=507400.0, ans=0.125 2023-09-29 22:20:00,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:00,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:04,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 22:20:08,729 INFO [train.py:1039] (0/4) Epoch 15, batch 1750, loss[loss=0.1649, simple_loss=0.2093, pruned_loss=0.06023, over 19043.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2594, pruned_loss=0.05748, over 4695554.59 frames. ], batch size: 389, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:20:08,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:11,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:11,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 22:20:14,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 22:20:14,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:20:15,999 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=507466.6666666667, ans=0.125 2023-09-29 22:20:17,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:20:17,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=507466.6666666667, ans=0.1 2023-09-29 22:20:18,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:24,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 22:20:26,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:28,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 22:20:28,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:20:29,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:20:34,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-29 22:20:35,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 22:20:37,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:20:37,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 22:20:39,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=507533.3333333333, ans=0.125 2023-09-29 22:20:46,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:20:50,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:20:50,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:53,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:53,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:56,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:57,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:00,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 22:21:02,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:02,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 22:21:03,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:06,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 22:21:07,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:10,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:11,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:21:16,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:21:16,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 22:21:16,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:19,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:22,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:21:25,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:27,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:21:27,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=507733.3333333333, ans=0.0 2023-09-29 22:21:29,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 22:21:29,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:30,900 INFO [train.py:1039] (0/4) Epoch 15, batch 1800, loss[loss=0.1565, simple_loss=0.2313, pruned_loss=0.04086, over 16735.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2587, pruned_loss=0.0572, over 4678098.31 frames. ], batch size: 36, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:21:30,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:21:30,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:31,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:21:31,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:21:31,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-09-29 22:21:34,570 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.864e+02 2.038e+02 2.350e+02 3.855e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-29 22:21:34,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:21:35,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=507800.0, ans=0.125 2023-09-29 22:21:36,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:37,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:21:40,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:42,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:21:42,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=507800.0, ans=0.1 2023-09-29 22:21:45,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:21:48,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:21:51,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:21:51,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:51,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=507866.6666666667, ans=0.125 2023-09-29 22:21:53,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:21:54,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:54,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 22:21:56,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:56,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=507866.6666666667, ans=0.125 2023-09-29 22:21:59,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:00,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=507866.6666666667, ans=0.0 2023-09-29 22:22:03,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 22:22:03,864 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=507933.3333333333, ans=0.0 2023-09-29 22:22:07,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-09-29 22:22:07,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 22:22:07,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:09,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:22:09,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:10,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:22:17,295 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 22:22:17,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:22:19,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=508000.0, ans=0.125 2023-09-29 22:22:20,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:22,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 22:22:23,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 22:22:23,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 22:22:25,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:22:27,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:22:28,947 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.93 vs. limit=15.0 2023-09-29 22:22:33,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 22:22:33,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=508000.0, ans=0.1 2023-09-29 22:22:38,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:22:39,542 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.99 vs. limit=22.5 2023-09-29 22:22:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 22:22:42,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:22:42,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:42,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. 
Duration: 0.911125 2023-09-29 22:22:43,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 22:22:46,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:22:46,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:22:49,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 22:22:49,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:50,020 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.58 vs. limit=15.0 2023-09-29 22:22:50,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:51,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:22:51,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:51,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:52,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 22:22:54,111 INFO [train.py:1039] (0/4) Epoch 15, batch 1850, loss[loss=0.1944, simple_loss=0.2714, pruned_loss=0.0587, over 24445.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2593, pruned_loss=0.05659, over 4698747.04 frames. ], batch size: 69, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:22:54,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=508133.3333333333, ans=0.1 2023-09-29 22:22:55,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:55,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:58,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:23:00,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:08,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:23:08,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 22:23:15,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 22:23:16,675 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.48 vs. limit=15.0 2023-09-29 22:23:18,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. 
Duration: 0.93 2023-09-29 22:23:23,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:23,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 22:23:23,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 22:23:32,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=508266.6666666667, ans=0.1 2023-09-29 22:23:32,039 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=508266.6666666667, ans=0.125 2023-09-29 22:23:33,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:23:33,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 22:23:36,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:23:37,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:23:41,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 22:23:41,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:41,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 22:23:43,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:23:45,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:47,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:23:47,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=508333.3333333333, ans=0.2 2023-09-29 22:23:48,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=508333.3333333333, ans=0.0 2023-09-29 22:23:51,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:23:51,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:52,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:23:52,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:53,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:23:55,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:23:58,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-09-29 22:24:00,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:24:02,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=508400.0, ans=0.125 2023-09-29 22:24:03,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:24:05,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:24:05,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 22:24:05,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 22:24:06,692 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 22:24:07,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=508400.0, ans=15.0 2023-09-29 22:24:08,180 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 22:24:09,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:24:09,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:24:09,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 22:24:09,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:09,936 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 22:24:09,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:24:11,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:11,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:24:15,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:24:17,056 INFO [train.py:1039] (0/4) Epoch 15, batch 1900, loss[loss=0.1835, simple_loss=0.2572, pruned_loss=0.05488, over 23553.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2604, pruned_loss=0.05664, over 4710250.92 frames. ], batch size: 135, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:24:17,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:24:17,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 22:24:18,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:18,822 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. 
Duration: 0.896 2023-09-29 22:24:18,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:24:20,831 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.037e+02 2.291e+02 2.918e+02 4.608e+02, threshold=4.583e+02, percent-clipped=3.0 2023-09-29 22:24:20,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:25,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:27,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=508466.6666666667, ans=0.2 2023-09-29 22:24:28,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:24:29,946 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 22:24:30,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 22:24:33,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:33,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:24:34,978 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 22:24:35,032 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. 
Duration: 0.896 2023-09-29 22:24:39,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 22:24:41,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:24:41,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=508533.3333333333, ans=0.125 2023-09-29 22:24:44,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 22:24:45,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 22:24:46,468 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.52 vs. limit=15.0 2023-09-29 22:24:47,655 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:24:59,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 22:25:02,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 22:25:02,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:02,786 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 22:25:02,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. 
Duration: 0.85 2023-09-29 22:25:02,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=508600.0, ans=0.035 2023-09-29 22:25:03,337 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.00 vs. limit=15.0 2023-09-29 22:25:04,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 22:25:04,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 22:25:04,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:04,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=508666.6666666667, ans=0.0 2023-09-29 22:25:04,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=508666.6666666667, ans=0.1 2023-09-29 22:25:09,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 22:25:12,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:25:16,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:16,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 22:25:18,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 22:25:18,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=508666.6666666667, ans=0.025 2023-09-29 22:25:21,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 22:25:21,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:25,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=508733.3333333333, ans=0.125 2023-09-29 22:25:28,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:25:28,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:25:28,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:25:30,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:25:32,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:25:32,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:25:33,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 22:25:34,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=508733.3333333333, ans=0.125 2023-09-29 22:25:36,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:36,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:25:38,209 INFO [train.py:1039] (0/4) Epoch 15, batch 1950, loss[loss=0.1877, simple_loss=0.2668, pruned_loss=0.05429, over 23910.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2618, pruned_loss=0.0568, over 4726131.88 frames. ], batch size: 86, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:25:39,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:25:39,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:41,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:41,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:44,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:25:48,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:25:48,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:25:48,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:25:49,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 22:25:51,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:25:51,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:53,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:54,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:25:56,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:25:56,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:58,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:26:01,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:26:01,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:26:01,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 22:26:01,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:06,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:08,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:26:08,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:08,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:26:08,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 22:26:09,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:26:09,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:26:10,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:13,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:16,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:26:21,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=508933.3333333333, ans=0.09899494936611666 2023-09-29 22:26:22,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:26:25,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:26:25,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:26:25,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 22:26:25,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:26:32,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:26:33,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:26:35,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:26:42,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:44,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:26:46,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:49,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:51,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:26:52,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:53,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 22:26:53,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:26:53,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:56,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 22:26:58,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:26:59,706 INFO [train.py:1039] (0/4) Epoch 15, batch 2000, loss[loss=0.1984, simple_loss=0.2596, pruned_loss=0.06858, over 23760.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2629, pruned_loss=0.05739, over 4718898.37 frames. 
], batch size: 212, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:27:02,736 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.867e+02 2.115e+02 2.554e+02 3.825e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 22:27:02,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:27:05,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:27:05,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:06,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:27:09,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:13,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 22:27:13,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:27:17,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=509200.0, ans=0.125 2023-09-29 22:27:18,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:27:21,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. 
Duration: 0.73 2023-09-29 22:27:21,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:27:21,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:27:24,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:27:25,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 22:27:27,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:27,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:28,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:29,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 22:27:29,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:27:31,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 22:27:31,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:35,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 22:27:36,603 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.26 vs. limit=22.5 2023-09-29 22:27:37,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:27:37,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:39,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:39,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:27:40,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 22:27:44,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 22:27:44,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:44,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:27:50,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:51,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:27:51,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-29 22:27:52,440 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.14 vs. limit=15.0 2023-09-29 22:27:52,465 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.29 vs. limit=15.0 2023-09-29 22:27:53,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:54,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:54,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:56,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:56,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:57,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:00,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:28:02,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 22:28:07,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-29 22:28:09,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:28:16,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:17,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:17,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:19,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:28:19,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:28:23,137 INFO [train.py:1039] (0/4) Epoch 15, batch 2050, loss[loss=0.1619, simple_loss=0.2391, pruned_loss=0.04239, over 24585.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2618, pruned_loss=0.05713, over 4720592.89 frames. ], batch size: 60, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:28:23,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:24,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:28:27,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:27,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:31,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:28:34,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:28:34,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:34,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:28:37,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 22:28:37,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:28:38,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:28:38,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:28:50,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:28:50,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 22:28:54,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 22:28:54,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=509600.0, ans=0.0 2023-09-29 22:28:55,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:57,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 22:28:57,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:29:02,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:04,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:06,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:29:06,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:08,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:29:09,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:29:11,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-29 22:29:14,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:15,314 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.23 vs. limit=15.0 2023-09-29 22:29:16,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:29:18,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:29:18,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=509666.6666666667, ans=0.1 2023-09-29 22:29:20,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=509666.6666666667, ans=0.125 2023-09-29 22:29:21,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:29:23,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:31,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:29:32,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 22:29:36,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:29:37,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:29:40,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:29:42,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 22:29:43,639 INFO [train.py:1039] (0/4) Epoch 15, batch 2100, loss[loss=0.1828, simple_loss=0.2371, pruned_loss=0.06429, over 23496.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2606, pruned_loss=0.05668, over 4716467.33 frames. ], batch size: 285, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:29:45,586 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 22:29:45,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:45,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:46,885 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.817e+02 2.090e+02 2.571e+02 3.864e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 22:29:47,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:29:48,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:48,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. 
Duration: 0.55 2023-09-29 22:29:48,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 22:29:50,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:55,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:29:56,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:29:57,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:59,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:29:59,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 22:30:01,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:30:01,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 22:30:01,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 22:30:04,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:04,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-09-29 22:30:04,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 22:30:04,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 22:30:09,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 22:30:09,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:30:13,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:30:14,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:30:18,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:30:18,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 22:30:20,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:20,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:30:21,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 22:30:21,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:30:21,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 22:30:21,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 22:30:23,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 22:30:26,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:30:30,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:30:32,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:33,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:35,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 22:30:37,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:37,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:30:37,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 22:30:40,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 22:30:40,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 22:30:43,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:30:46,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:46,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 22:30:51,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:54,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:30:54,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=510066.6666666667, ans=0.125 2023-09-29 22:30:56,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:30:56,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-29 22:30:56,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:30:56,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:30:56,747 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.64 vs. limit=15.0 2023-09-29 22:30:57,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:57,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:30:59,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:30:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:03,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 22:31:07,263 INFO [train.py:1039] (0/4) Epoch 15, batch 2150, loss[loss=0.2048, simple_loss=0.2741, pruned_loss=0.06778, over 23333.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2597, pruned_loss=0.05645, over 4715780.53 frames. ], batch size: 93, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:31:07,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 22:31:07,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:31:08,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:31:08,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:31:09,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:31:09,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:31:15,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:31:18,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:18,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:18,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=510133.3333333333, ans=0.0 2023-09-29 22:31:20,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:31:20,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:20,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 22:31:23,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:24,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:31:24,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:31:27,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:27,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 22:31:29,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys.whitening_limit, batch_count=510200.0, ans=6.0 2023-09-29 22:31:32,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:34,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:31:36,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:36,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:36,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:31:38,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:31:38,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:38,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:31:40,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:31:42,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 22:31:43,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:31:43,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:45,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:45,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:31:46,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:31:48,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:50,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 22:31:50,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=510266.6666666667, ans=0.035 2023-09-29 22:31:51,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:51,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 22:31:51,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:31:54,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:56,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:57,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:59,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:31:59,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:00,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:00,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 22:32:02,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. 
Duration: 0.91 2023-09-29 22:32:02,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:32:02,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=510333.3333333333, ans=0.125 2023-09-29 22:32:03,799 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 22:32:03,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:04,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:32:07,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 22:32:07,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:32:07,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 22:32:07,356 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 22:32:07,357 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 22:32:07,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 22:32:09,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=510333.3333333333, ans=0.0 2023-09-29 22:32:11,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:32:11,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:32:12,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:32:12,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:14,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:32:14,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:14,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:18,658 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.47 vs. limit=22.5 2023-09-29 22:32:22,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:32:23,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 22:32:25,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:32:28,654 INFO [train.py:1039] (0/4) Epoch 15, batch 2200, loss[loss=0.1998, simple_loss=0.2734, pruned_loss=0.06315, over 23397.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2607, pruned_loss=0.05726, over 4701405.14 frames. 
], batch size: 93, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:32:31,713 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.917e+02 2.133e+02 2.422e+02 4.121e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 22:32:31,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:31,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:32:32,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:32:33,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:32:35,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:35,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:32:35,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 22:32:42,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 22:32:44,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:32:49,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. 
Duration: 0.77 2023-09-29 22:32:52,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:53,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:32:53,327 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=510533.3333333333, ans=0.125 2023-09-29 22:32:54,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:33:00,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:33:00,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 22:33:04,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:33:06,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:06,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 22:33:10,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:33:11,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:14,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:33:14,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:16,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 22:33:18,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:20,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 22:33:22,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:23,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:33:23,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:26,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:33:28,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:28,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:28,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:29,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. 
Duration: 0.536375 2023-09-29 22:33:30,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:33:32,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:33:36,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:33:37,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:33:39,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:33:40,661 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 22:33:42,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:33:42,425 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 22:33:43,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:33:45,337 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 22:33:47,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:47,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 22:33:50,578 INFO [train.py:1039] (0/4) Epoch 15, batch 2250, loss[loss=0.1971, simple_loss=0.2653, pruned_loss=0.06448, over 23588.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2625, pruned_loss=0.05815, over 4701168.68 frames. ], batch size: 149, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:33:50,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:51,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=510800.0, ans=0.1 2023-09-29 22:33:52,575 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 22:33:54,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:33:56,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:33:57,936 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.95 vs. limit=22.5 2023-09-29 22:34:01,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:34:01,931 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=510800.0, ans=0.125 2023-09-29 22:34:03,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:34:07,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 22:34:07,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:09,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:34:10,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 22:34:12,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:12,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:34:13,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 22:34:15,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:34:15,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:17,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:23,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:24,139 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=510933.3333333333, ans=0.0 2023-09-29 22:34:25,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-29 22:34:25,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:34:26,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 22:34:26,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:27,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=510933.3333333333, ans=0.0 2023-09-29 22:34:30,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:34:34,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:35,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:38,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:34:38,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:41,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:43,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:34:47,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 22:34:48,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=511000.0, ans=0.2 2023-09-29 22:34:50,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:34:52,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=511000.0, ans=0.125 2023-09-29 22:34:56,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:34:56,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:34:58,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:34:59,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=511066.6666666667, ans=0.0 2023-09-29 22:34:59,859 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=7.01 vs. limit=12.0 2023-09-29 22:35:04,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:35:08,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:35:08,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 22:35:08,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:35:08,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:35:11,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 22:35:12,807 INFO [train.py:1039] (0/4) Epoch 15, batch 2300, loss[loss=0.1798, simple_loss=0.2559, pruned_loss=0.0518, over 24637.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2627, pruned_loss=0.05814, over 4707600.10 frames. ], batch size: 60, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:35:14,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:35:15,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:19,133 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.881e+02 2.170e+02 2.531e+02 3.802e+02, threshold=4.341e+02, percent-clipped=0.0 2023-09-29 22:35:20,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:22,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:35:23,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 22:35:25,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:32,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 22:35:32,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:35:32,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:35:33,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:33,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 22:35:35,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:35:40,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:35:40,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:35:45,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:35:48,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:35:51,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:35:57,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:35:57,952 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.15 vs. 
limit=12.0 2023-09-29 22:35:58,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:36:00,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:36:01,273 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.66 vs. limit=10.0 2023-09-29 22:36:02,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=511333.3333333333, ans=0.1 2023-09-29 22:36:03,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:06,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=511333.3333333333, ans=0.125 2023-09-29 22:36:07,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:36:08,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:36:08,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:36:08,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 22:36:12,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:36:12,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 22:36:13,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:13,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:36:13,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:15,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:36:15,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:36:15,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 22:36:15,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:36:15,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:15,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 22:36:21,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:36:26,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:36:30,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:36:30,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:36:30,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:36:32,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:36:32,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:36:32,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:36:33,775 INFO [train.py:1039] (0/4) Epoch 15, batch 2350, loss[loss=0.1679, simple_loss=0.2422, pruned_loss=0.04684, over 24446.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2638, pruned_loss=0.05889, over 4711406.18 frames. ], batch size: 58, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:36:33,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 22:36:41,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:36:42,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 22:36:48,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 22:36:50,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 22:36:52,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=511533.3333333333, ans=0.125 2023-09-29 22:36:52,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=511533.3333333333, ans=0.04949747468305833 2023-09-29 22:36:55,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:36:55,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:56,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 22:37:01,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:37:04,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=511600.0, ans=0.0 2023-09-29 22:37:07,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 22:37:09,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 22:37:12,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:37:12,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:37:15,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:37:17,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 22:37:17,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:37:19,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:37:19,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:19,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:37:24,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:37:28,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 22:37:28,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:37:30,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:37:31,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:37:32,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=511666.6666666667, ans=0.1 2023-09-29 22:37:33,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 22:37:33,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:37:36,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 22:37:36,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:37:39,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 22:37:41,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=511733.3333333333, ans=0.0 2023-09-29 22:37:43,432 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.81 vs. limit=6.0 2023-09-29 22:37:44,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 22:37:44,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:44,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. 
Duration: 0.611125 2023-09-29 22:37:44,384 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 22:37:45,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 22:37:48,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 22:37:52,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:37:56,456 INFO [train.py:1039] (0/4) Epoch 15, batch 2400, loss[loss=0.1783, simple_loss=0.2644, pruned_loss=0.04608, over 24690.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2627, pruned_loss=0.05812, over 4722250.93 frames. ], batch size: 73, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:37:58,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:38:01,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:38:03,208 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.881e+02 2.096e+02 2.397e+02 4.111e+02, threshold=4.192e+02, percent-clipped=0.0 2023-09-29 22:38:03,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:38:03,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 22:38:03,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. 
Duration: 0.85 2023-09-29 22:38:03,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=511800.0, ans=0.125 2023-09-29 22:38:08,362 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=511800.0, ans=0.125 2023-09-29 22:38:12,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:38:12,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:15,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 22:38:15,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:38:17,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:17,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 22:38:23,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:25,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 22:38:30,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 22:38:30,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=511933.3333333333, ans=0.125 2023-09-29 22:38:37,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 22:38:38,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:38:40,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:42,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=511933.3333333333, ans=0.125 2023-09-29 22:38:43,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:38:43,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 22:38:43,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:38:51,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:53,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:38:56,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:38:58,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:38:58,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:38:58,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:38:58,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:58,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:00,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:39:05,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:07,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:39:07,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 22:39:07,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=512066.6666666667, ans=0.0 2023-09-29 22:39:08,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 22:39:10,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 22:39:10,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:39:10,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 22:39:11,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 22:39:11,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 22:39:11,955 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 22:39:13,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 22:39:14,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:39:16,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:16,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:17,955 INFO [train.py:1039] (0/4) Epoch 15, batch 2450, loss[loss=0.1651, simple_loss=0.2267, pruned_loss=0.05175, over 23601.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2611, pruned_loss=0.0574, over 4729839.05 frames. ], batch size: 256, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:39:18,106 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-29 22:39:18,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:19,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:39:22,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:39:22,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:26,523 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-09-29 22:39:27,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:27,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:27,833 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=512133.3333333333, ans=0.035 2023-09-29 22:39:29,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 22:39:29,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=512133.3333333333, ans=0.2 2023-09-29 22:39:34,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:39:34,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:35,123 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.45 vs. limit=6.0 2023-09-29 22:39:37,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:39:37,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:39:37,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:39:39,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 22:39:43,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:46,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:39:47,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:51,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=512266.6666666667, ans=0.0 2023-09-29 22:39:52,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 22:39:52,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:52,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=512266.6666666667, ans=0.125 2023-09-29 22:39:54,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:55,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 22:39:57,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:40:07,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:09,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:40:09,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:09,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:40:11,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:40:12,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:40:12,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 22:40:14,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:40:16,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:40:19,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:40:19,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:24,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:40:24,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 22:40:26,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:40:26,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:40:27,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 22:40:29,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:40:29,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:40:31,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=512400.0, ans=0.0 2023-09-29 22:40:32,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:40:34,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:34,465 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=512400.0, ans=0.125 2023-09-29 22:40:35,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:40:39,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 22:40:41,385 INFO [train.py:1039] (0/4) Epoch 15, batch 2500, loss[loss=0.1901, simple_loss=0.2267, pruned_loss=0.0768, over 18847.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2609, pruned_loss=0.05706, over 4738971.34 frames. ], batch size: 388, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:40:41,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:40:48,502 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.863e+02 2.026e+02 2.249e+02 3.310e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-29 22:40:48,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:40:56,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=512466.6666666667, ans=0.0 2023-09-29 22:40:58,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:40:58,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:41:00,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:41:00,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 22:41:05,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=512533.3333333333, ans=0.125 2023-09-29 22:41:06,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=512533.3333333333, ans=0.125 2023-09-29 22:41:07,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:41:08,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:08,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:41:08,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-29 22:41:10,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 22:41:11,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:11,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:11,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 22:41:13,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:13,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 22:41:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:20,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:41:20,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:22,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:41:24,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 22:41:25,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 22:41:27,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:31,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:32,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=512666.6666666667, ans=0.125 2023-09-29 22:41:36,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:38,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=512666.6666666667, ans=0.125 2023-09-29 22:41:39,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:44,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:41:46,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 22:41:48,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:48,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. 
Duration: 0.42725 2023-09-29 22:41:48,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=512733.3333333333, ans=0.125 2023-09-29 22:41:50,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:41:50,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:41:50,196 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 22:41:50,197 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 22:41:50,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 22:41:54,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:54,943 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=512733.3333333333, ans=0.2 2023-09-29 22:41:56,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 22:41:56,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 22:41:57,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:59,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. 
Duration: 0.76 2023-09-29 22:42:03,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 22:42:04,948 INFO [train.py:1039] (0/4) Epoch 15, batch 2550, loss[loss=0.1772, simple_loss=0.2637, pruned_loss=0.04533, over 24325.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2612, pruned_loss=0.05701, over 4735042.61 frames. ], batch size: 61, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:42:07,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:10,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:42:10,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:42:13,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:15,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 22:42:15,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:42:15,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=512800.0, ans=0.0 2023-09-29 22:42:16,026 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.83 vs. limit=15.0 2023-09-29 22:42:19,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. 
Duration: 0.8 2023-09-29 22:42:21,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:42:24,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:26,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:42:26,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 22:42:28,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:42:28,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:29,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:42:32,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:42:32,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 22:42:32,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:42:32,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:32,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. 
Duration: 0.94 2023-09-29 22:42:42,889 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.79 vs. limit=22.5 2023-09-29 22:42:45,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:42:47,472 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.47 vs. limit=10.0 2023-09-29 22:42:51,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:42:51,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:51,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:52,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:42:58,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:43:01,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:43:01,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 22:43:02,331 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:43:03,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:43:03,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:43:03,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:43:07,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:07,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:11,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:43:11,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 22:43:11,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:43:11,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:13,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:43:15,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-29 22:43:15,280 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=513066.6666666667, ans=0.1 2023-09-29 22:43:18,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:23,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:43:24,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:27,863 INFO [train.py:1039] (0/4) Epoch 15, batch 2600, loss[loss=0.1894, simple_loss=0.2579, pruned_loss=0.06049, over 23840.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2611, pruned_loss=0.05706, over 4730040.28 frames. ], batch size: 195, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:43:28,077 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 22:43:31,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=513133.3333333333, ans=0.0 2023-09-29 22:43:32,890 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 22:43:32,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:43:34,984 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.927e+02 2.129e+02 2.377e+02 3.619e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-29 22:43:35,111 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. 
Duration: 0.896 2023-09-29 22:43:35,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 22:43:35,283 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 22:43:35,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=513133.3333333333, ans=0.125 2023-09-29 22:43:38,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=513133.3333333333, ans=0.125 2023-09-29 22:43:39,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:39,768 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 22:43:39,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 22:43:42,045 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 22:43:45,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:43:45,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 22:43:48,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 22:43:49,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. 
Duration: 0.688875 2023-09-29 22:43:49,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 22:43:51,326 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 22:43:51,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 22:44:01,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:01,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:01,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:01,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 22:44:04,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:44:06,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=513266.6666666667, ans=0.125 2023-09-29 22:44:06,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=513266.6666666667, ans=0.125 2023-09-29 22:44:10,603 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 22:44:18,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:44:18,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:19,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 22:44:19,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:19,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:21,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 22:44:24,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:44:24,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:44:26,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,869 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 22:44:29,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-29 22:44:33,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=513400.0, ans=0.125 2023-09-29 22:44:36,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:36,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:44:38,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 22:44:38,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:38,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=513400.0, ans=0.125 2023-09-29 22:44:39,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:44:41,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:44:46,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 22:44:48,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:50,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:44:51,962 INFO [train.py:1039] (0/4) Epoch 15, batch 2650, loss[loss=0.2012, simple_loss=0.2812, pruned_loss=0.06058, over 23996.00 frames. 
], tot_loss[loss=0.1888, simple_loss=0.2622, pruned_loss=0.05777, over 4727167.71 frames. ], batch size: 86, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:44:55,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 22:44:55,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:56,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:44:58,148 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 22:44:58,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:01,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:03,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:45:06,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:45:06,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:45:06,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=513533.3333333333, ans=0.09899494936611666 2023-09-29 22:45:08,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. 
Duration: 0.92 2023-09-29 22:45:08,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:45:08,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:45:10,918 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.45 vs. limit=5.0 2023-09-29 22:45:11,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 22:45:13,358 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 22:45:13,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=513533.3333333333, ans=0.125 2023-09-29 22:45:16,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:17,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 22:45:20,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:20,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 22:45:23,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:23,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. 
Duration: 0.563625 2023-09-29 22:45:25,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:25,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:27,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=513600.0, ans=0.125 2023-09-29 22:45:30,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 22:45:31,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 22:45:33,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:45:38,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 22:45:38,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:39,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:39,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:45:40,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:41,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 22:45:41,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=513666.6666666667, ans=0.125 2023-09-29 22:45:43,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:44,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:44,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:46,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:45:48,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:45:49,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:49,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:45:51,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:52,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:52,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:45:57,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:45:58,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:45:58,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:58,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 22:46:03,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:06,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:09,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:46:11,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:13,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:14,397 INFO [train.py:1039] (0/4) Epoch 15, batch 2700, loss[loss=0.1854, simple_loss=0.2698, pruned_loss=0.0505, over 24695.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2625, pruned_loss=0.05792, over 4728809.68 frames. 
], batch size: 73, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:46:14,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 22:46:16,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:46:17,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 22:46:19,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:46:19,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:19,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:21,309 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.958e+02 2.156e+02 2.389e+02 4.797e+02, threshold=4.312e+02, percent-clipped=1.0 2023-09-29 22:46:21,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:46:21,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:46:22,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:46:23,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. 
Duration: 0.6 2023-09-29 22:46:23,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 22:46:24,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:46:25,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:46:27,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:46:28,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:34,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:46:34,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 22:46:34,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:46:40,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:46:40,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:46:47,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:46:47,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:46:49,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:46:49,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:46:50,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:46:53,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:53,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:46:53,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:46:58,315 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.75 vs. limit=12.0 2023-09-29 22:46:59,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:59,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:47:08,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:47:08,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:47:12,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:47:12,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:16,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514000.0, ans=0.1 2023-09-29 22:47:17,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:17,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:19,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:47:20,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:22,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:22,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:47:22,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=514066.6666666667, ans=0.125 2023-09-29 22:47:25,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. 
Duration: 0.736375 2023-09-29 22:47:28,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:28,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:31,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 22:47:33,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:36,519 INFO [train.py:1039] (0/4) Epoch 15, batch 2750, loss[loss=0.1645, simple_loss=0.2482, pruned_loss=0.04037, over 24479.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2621, pruned_loss=0.05785, over 4723750.52 frames. ], batch size: 66, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:47:36,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:47:36,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 22:47:38,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 22:47:40,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:42,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:47:42,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:47:45,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:45,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:47:47,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:50,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:47:50,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:47:51,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:47:51,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:51,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 22:47:51,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:47:53,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:58,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 22:48:00,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:48:01,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:01,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:01,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:48:03,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:48:03,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:48:03,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:04,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:06,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514200.0, ans=0.1 2023-09-29 22:48:09,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:48:11,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:48:11,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-29 22:48:12,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:14,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:48:15,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=514266.6666666667, ans=0.125 2023-09-29 22:48:20,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:23,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:48:23,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:27,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:48:28,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:48:33,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=514333.3333333333, ans=0.0 2023-09-29 22:48:35,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 22:48:35,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:35,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 22:48:37,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=514333.3333333333, ans=0.2 2023-09-29 22:48:39,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:42,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 22:48:50,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:48:51,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:48:51,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 22:48:53,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:48:55,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=514400.0, ans=0.125 2023-09-29 22:48:56,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:48:56,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. 
Duration: 0.77 2023-09-29 22:48:56,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:48:59,390 INFO [train.py:1039] (0/4) Epoch 15, batch 2800, loss[loss=0.1859, simple_loss=0.2361, pruned_loss=0.06782, over 22749.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2605, pruned_loss=0.05766, over 4716437.86 frames. ], batch size: 322, lr: 6.89e-03, grad_scale: 32.0 2023-09-29 22:48:59,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:48:59,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:48:59,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=514466.6666666667, ans=0.0 2023-09-29 22:49:00,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:03,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 22:49:03,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:03,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:05,771 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.774e+02 1.954e+02 2.291e+02 3.351e+02, threshold=3.907e+02, percent-clipped=0.0 2023-09-29 22:49:05,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:49:07,376 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 22:49:07,377 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 22:49:09,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:10,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:49:10,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:49:12,692 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:49:15,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:49:17,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 22:49:19,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:49:20,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 22:49:22,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:22,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 22:49:22,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:27,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:28,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:28,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:49:28,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:49:39,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:49:39,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:41,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=514600.0, ans=0.2 2023-09-29 22:49:42,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:42,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:49:43,586 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.40 vs. 
limit=22.5 2023-09-29 22:49:44,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:48,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:49:48,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 22:49:50,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:51,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:51,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:49:54,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:56,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:00,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:50:01,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:50:01,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:01,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-29 22:50:01,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:50:03,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:50:04,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:50:04,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 22:50:04,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:05,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:50:05,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:08,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 22:50:10,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:10,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:50:10,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:50:13,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. 
Duration: 0.67 2023-09-29 22:50:19,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:50:19,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:50:21,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:50:22,715 INFO [train.py:1039] (0/4) Epoch 15, batch 2850, loss[loss=0.1596, simple_loss=0.2314, pruned_loss=0.0439, over 21366.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2597, pruned_loss=0.05735, over 4699050.59 frames. ], batch size: 46, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:50:24,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:24,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=514800.0, ans=0.0 2023-09-29 22:50:27,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:50:27,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:50:29,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:50:31,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:33,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:50:35,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:50:36,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 22:50:43,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 22:50:43,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:45,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 22:50:45,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:49,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 22:50:49,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 22:50:50,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:56,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=514933.3333333333, ans=0.125 2023-09-29 22:51:01,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=514933.3333333333, ans=0.125 2023-09-29 22:51:03,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 22:51:06,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:06,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:51:07,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:51:07,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:51:09,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:51:10,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:51:10,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 22:51:13,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:51:13,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:14,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:15,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:15,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:51:17,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:18,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:20,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:23,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:51:24,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:24,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:27,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:51:33,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:51:35,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 22:51:35,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 22:51:38,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:51:38,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:51:38,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 22:51:38,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:51:40,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:40,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:40,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:51:40,815 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 22:51:42,922 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 22:51:42,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:51:43,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:46,101 INFO [train.py:1039] (0/4) Epoch 15, batch 2900, loss[loss=0.1904, simple_loss=0.2599, pruned_loss=0.06044, over 23469.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.26, pruned_loss=0.05687, over 4705425.30 frames. ], batch size: 120, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:51:49,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. 
Duration: 0.67775 2023-09-29 22:51:49,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:49,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:50,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 22:51:53,872 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.822e+02 2.046e+02 2.406e+02 3.211e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-29 22:51:54,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:54,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 22:51:55,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 22:51:57,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:51:57,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:00,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:01,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:52:03,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-29 22:52:05,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:52:09,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:52:10,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 22:52:10,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:52:10,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=515200.0, ans=0.1 2023-09-29 22:52:12,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:15,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 22:52:15,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 22:52:20,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:52:20,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 22:52:20,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:52:23,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 22:52:23,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:52:26,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:26,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:31,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:52:33,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:52:33,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 22:52:33,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 22:52:33,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:52:38,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:52:40,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 22:52:41,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-29 22:52:42,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=515333.3333333333, ans=0.1 2023-09-29 22:52:45,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=515333.3333333333, ans=0.125 2023-09-29 22:52:47,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:56,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:52:56,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:58,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 22:53:01,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:01,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 22:53:02,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:02,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:53:07,959 INFO [train.py:1039] (0/4) Epoch 15, batch 2950, loss[loss=0.1798, simple_loss=0.2721, pruned_loss=0.04377, over 24426.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2613, pruned_loss=0.05743, over 4699046.85 frames. 
], batch size: 69, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:53:09,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:11,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 22:53:12,012 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=15.0 2023-09-29 22:53:13,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:13,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:14,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:53:17,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:53:17,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 22:53:17,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 22:53:19,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:53:19,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:53:25,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=515533.3333333333, ans=0.0 2023-09-29 22:53:26,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:27,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:29,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:53:30,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:31,592 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.47 vs. limit=22.5 2023-09-29 22:53:34,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:53:34,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:53:35,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 22:53:40,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 22:53:45,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 22:53:45,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 22:53:45,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:53:47,822 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 22:53:49,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 22:53:49,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:50,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:50,895 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 22:53:50,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:53:54,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 22:53:56,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 22:53:58,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:53:59,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:01,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:54:02,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:02,951 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 22:54:04,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:04,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 22:54:10,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:10,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:12,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 22:54:12,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:54:14,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. 
Duration: 0.49 2023-09-29 22:54:17,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:20,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:54:20,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:54:22,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:22,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:54:23,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:54:25,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:25,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:54:25,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:54:26,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:27,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:54:29,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 22:54:29,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 22:54:31,025 INFO [train.py:1039] (0/4) Epoch 15, batch 3000, loss[loss=0.1941, simple_loss=0.2597, pruned_loss=0.06431, over 23784.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2619, pruned_loss=0.05771, over 4708832.33 frames. ], batch size: 179, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:54:31,025 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 22:54:45,817 INFO [train.py:1071] (0/4) Epoch 15, validation: loss=0.2711, simple_loss=0.2767, pruned_loss=0.1327, over 1125622.00 frames. 2023-09-29 22:54:45,818 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-29 22:54:46,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:51,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:54:51,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:54:53,998 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.966e+02 2.278e+02 2.682e+02 4.156e+02, threshold=4.556e+02, percent-clipped=1.0 2023-09-29 22:54:54,233 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 22:54:54,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 22:54:57,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. 
Duration: 0.888875 2023-09-29 22:54:57,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:54:57,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 22:54:57,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:55:06,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:55:16,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:55:20,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=515933.3333333333, ans=0.0 2023-09-29 22:55:21,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 22:55:23,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:55:25,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:55:27,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:55:27,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:55:28,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 22:55:28,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 22:55:33,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 22:55:33,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:55:35,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:55:37,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:55:37,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:39,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:39,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:55:39,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=516000.0, ans=0.0 2023-09-29 22:55:42,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:55:42,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:42,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 22:55:45,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:46,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 22:55:48,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:55:48,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:55:48,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:55:51,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:53,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:54,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:55:54,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 22:55:54,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:55:54,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 22:55:56,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 22:55:58,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 22:56:00,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=516066.6666666667, ans=0.5 2023-09-29 22:56:02,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:02,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 22:56:02,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 22:56:05,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 22:56:05,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:56:07,789 INFO [train.py:1039] (0/4) Epoch 15, batch 3050, loss[loss=0.1838, simple_loss=0.267, pruned_loss=0.05032, over 24634.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2634, pruned_loss=0.05844, over 4708610.94 frames. ], batch size: 73, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:56:07,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:56:10,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:56:10,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-09-29 22:56:10,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:11,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:56:13,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 22:56:15,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:18,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:18,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:56:21,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:24,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 22:56:29,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 22:56:29,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 22:56:31,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:56:33,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=516200.0, ans=0.2 2023-09-29 22:56:36,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:56:37,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:37,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:39,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:44,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:56:44,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=516266.6666666667, ans=0.125 2023-09-29 22:56:44,845 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-09-29 22:56:45,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:47,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:47,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 22:56:47,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:47,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:50,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:51,967 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.62 vs. limit=15.0 2023-09-29 22:56:52,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:54,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 22:56:55,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:55,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:56:57,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:58,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:56:58,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 22:57:00,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:06,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:57:06,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:13,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:14,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:57:14,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:57:16,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:16,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:57:16,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:57:18,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 22:57:19,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:19,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:57:19,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 22:57:22,364 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.30 vs. limit=10.0 2023-09-29 22:57:23,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:27,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=516400.0, ans=0.0 2023-09-29 22:57:27,534 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-09-29 22:57:29,723 INFO [train.py:1039] (0/4) Epoch 15, batch 3100, loss[loss=0.1948, simple_loss=0.2687, pruned_loss=0.06045, over 24510.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2622, pruned_loss=0.05828, over 4706998.49 frames. ], batch size: 63, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:57:29,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:31,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:57:33,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:57:33,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=516466.6666666667, ans=0.125 2023-09-29 22:57:34,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. 
Duration: 0.89 2023-09-29 22:57:36,588 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=516466.6666666667, ans=0.0 2023-09-29 22:57:37,779 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.874e+02 2.072e+02 2.284e+02 2.890e+02, threshold=4.143e+02, percent-clipped=0.0 2023-09-29 22:57:37,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 22:57:40,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 22:57:40,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:57:44,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:46,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:47,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:57:54,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:58,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 22:58:05,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:58:05,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 22:58:05,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:07,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:08,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:58:10,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:58:10,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 22:58:10,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:58:10,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:11,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 22:58:13,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:58:16,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:58:17,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. 
Duration: 0.89 2023-09-29 22:58:18,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=516666.6666666667, ans=0.0 2023-09-29 22:58:18,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=516666.6666666667, ans=0.125 2023-09-29 22:58:20,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 22:58:20,364 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:58:21,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:21,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:23,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:23,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:23,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:58:26,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:58:26,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:58:29,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-29 22:58:29,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:58:29,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:29,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 22:58:34,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:35,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 22:58:37,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:58:38,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 22:58:40,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:40,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:41,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 22:58:50,898 INFO [train.py:1039] (0/4) Epoch 15, batch 3150, loss[loss=0.1869, simple_loss=0.2527, pruned_loss=0.06059, over 23840.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2612, pruned_loss=0.05771, over 4710975.14 frames. 
], batch size: 212, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:58:51,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 22:58:53,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:58:54,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:56,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:56,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:58:57,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=516800.0, ans=0.125 2023-09-29 22:58:58,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 22:58:58,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=516800.0, ans=0.0 2023-09-29 22:59:00,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:00,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:59:00,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-09-29 22:59:02,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=516800.0, ans=0.125 2023-09-29 22:59:03,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:05,323 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 22:59:10,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 22:59:10,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:11,726 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 22:59:11,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:59:13,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 22:59:14,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 22:59:14,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 22:59:14,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:14,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 22:59:16,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:18,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=516866.6666666667, ans=0.0 2023-09-29 22:59:19,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 22:59:20,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:23,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:59:27,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 22:59:27,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:59:28,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=516933.3333333333, ans=0.025 2023-09-29 22:59:29,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:59:31,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 22:59:31,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 22:59:34,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 22:59:34,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:59:36,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:59:36,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:59:36,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:36,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:59:38,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:59:38,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:59:40,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 22:59:40,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:59:40,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 22:59:43,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:59:43,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:44,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 22:59:44,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:46,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 22:59:46,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:47,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 22:59:49,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 22:59:49,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:59:51,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:51,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 22:59:52,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-29 22:59:52,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:57,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:58,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:58,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:00:05,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:00:06,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:09,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 23:00:11,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=517066.6666666667, ans=0.0 2023-09-29 23:00:12,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:00:12,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 23:00:14,362 INFO [train.py:1039] (0/4) Epoch 15, batch 3200, loss[loss=0.1927, simple_loss=0.26, pruned_loss=0.06266, over 23328.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2602, pruned_loss=0.05719, over 4716801.43 frames. 
], batch size: 119, lr: 6.87e-03, grad_scale: 32.0 2023-09-29 23:00:16,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:17,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:00:17,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 23:00:20,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:00:22,258 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.845e+02 1.990e+02 2.356e+02 4.554e+02, threshold=3.981e+02, percent-clipped=2.0 2023-09-29 23:00:23,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:00:24,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=517133.3333333333, ans=0.1 2023-09-29 23:00:28,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:36,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=517200.0, ans=0.2 2023-09-29 23:00:37,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 23:00:44,833 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=517200.0, ans=0.0 2023-09-29 23:00:44,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=517200.0, ans=0.2 2023-09-29 23:00:46,407 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=517266.6666666667, ans=0.125 2023-09-29 23:00:49,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 23:00:52,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:00:55,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 23:00:56,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:00:59,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=517266.6666666667, ans=0.07 2023-09-29 23:01:01,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:01:01,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:01:03,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:01:04,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. 
Duration: 0.7 2023-09-29 23:01:07,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 23:01:07,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 23:01:12,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 23:01:15,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:01:22,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:22,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:01:22,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:23,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 23:01:23,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:01:27,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:27,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 23:01:28,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. 
Duration: 0.6 2023-09-29 23:01:28,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 23:01:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 23:01:32,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:01:33,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:01:35,027 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 23:01:35,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:01:35,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:01:36,530 INFO [train.py:1039] (0/4) Epoch 15, batch 3250, loss[loss=0.2208, simple_loss=0.2955, pruned_loss=0.07301, over 24386.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2601, pruned_loss=0.05738, over 4703133.87 frames. ], batch size: 77, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:01:36,695 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 23:01:40,493 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.86 vs. limit=22.5 2023-09-29 23:01:43,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 23:01:45,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:01:46,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=517466.6666666667, ans=0.035 2023-09-29 23:01:54,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:01:54,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 23:01:54,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:55,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:55,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:01:57,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:01:57,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:02:00,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:00,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:02:01,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:02:01,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:03,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:04,374 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.32 vs. limit=15.0 2023-09-29 23:02:06,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:02:09,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:09,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:11,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:11,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:02:11,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 23:02:17,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 23:02:18,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:02:18,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:02:20,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:20,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:02:28,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:02:30,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=517666.6666666667, ans=0.04949747468305833 2023-09-29 23:02:38,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:38,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:38,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 23:02:38,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:02:38,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-29 23:02:39,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:39,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=517666.6666666667, ans=0.0 2023-09-29 23:02:41,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 23:02:42,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 23:02:42,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:44,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:44,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:45,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:02:45,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:49,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:49,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:02:51,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-09-29 23:02:51,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:02:53,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:02:53,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 23:02:53,377 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=517733.3333333333, ans=0.0 2023-09-29 23:02:55,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=517733.3333333333, ans=0.0 2023-09-29 23:02:58,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:58,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 23:02:59,920 INFO [train.py:1039] (0/4) Epoch 15, batch 3300, loss[loss=0.2676, simple_loss=0.3126, pruned_loss=0.1113, over 19577.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2616, pruned_loss=0.05809, over 4704819.23 frames. ], batch size: 388, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:03:02,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 23:03:03,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 23:03:03,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:03:08,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:03:08,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=517800.0, ans=0.0 2023-09-29 23:03:09,617 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.939e+02 2.168e+02 2.538e+02 3.579e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 23:03:09,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:03:09,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:11,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:03:11,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:03:16,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:18,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:03:22,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 23:03:22,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:22,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:03:23,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:23,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 23:03:24,485 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.44 vs. limit=12.0 2023-09-29 23:03:25,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:03:26,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:03:28,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:03:28,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:03:28,287 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 23:03:32,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:32,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:03:34,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:34,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. 
Duration: 0.97725 2023-09-29 23:03:36,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:03:36,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:38,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:03:41,187 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 23:03:42,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 23:03:44,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:03:44,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=517933.3333333333, ans=0.1 2023-09-29 23:03:47,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 23:03:48,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:03:50,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:03:50,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:03:52,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:03:53,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:53,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:53,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:03:54,298 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.15 vs. limit=15.0 2023-09-29 23:03:55,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:03:55,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:56,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:03:57,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=518000.0, ans=0.0 2023-09-29 23:03:59,855 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 23:04:01,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 23:04:03,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:04:05,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:04:05,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:07,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:04:07,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:11,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:04:11,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:11,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:04:12,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:04:14,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:04:14,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=518066.6666666667, ans=0.2 2023-09-29 23:04:15,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 23:04:17,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:18,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:04:20,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:04:20,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:04:21,761 INFO [train.py:1039] (0/4) Epoch 15, batch 3350, loss[loss=0.1688, simple_loss=0.2507, pruned_loss=0.04344, over 24444.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2622, pruned_loss=0.05833, over 4716683.86 frames. ], batch size: 63, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:04:21,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:22,756 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.37 vs. limit=15.0 2023-09-29 23:04:24,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:24,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:26,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:04:26,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=518133.3333333333, ans=0.0 2023-09-29 23:04:28,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=518133.3333333333, ans=0.2 2023-09-29 23:04:29,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:04:30,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:04:32,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:35,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:04:37,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:39,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:04:40,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 23:04:42,689 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 23:04:42,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:47,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 23:04:47,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 23:04:48,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:04:48,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 23:04:50,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:50,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 23:04:50,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:51,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:04:52,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:54,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:55,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:56,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:04:59,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:04:59,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:01,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:05,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:05:07,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:10,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:10,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:11,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:15,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 23:05:15,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:05:15,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 23:05:15,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:05:18,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 23:05:19,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:21,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:28,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:05:29,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 23:05:31,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:05:31,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:05:32,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:05:39,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:40,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 23:05:42,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:05:42,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:05:42,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:43,760 INFO [train.py:1039] (0/4) Epoch 15, batch 3400, loss[loss=0.2021, simple_loss=0.2682, pruned_loss=0.06804, over 23605.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2636, pruned_loss=0.05883, over 4710047.77 frames. ], batch size: 149, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:05:43,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. 
Duration: 0.94 2023-09-29 23:05:43,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:43,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 23:05:46,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:05:48,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:05:48,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 23:05:54,290 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.906e+02 2.208e+02 2.568e+02 3.814e+02, threshold=4.417e+02, percent-clipped=0.0 2023-09-29 23:05:54,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 23:05:54,426 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 23:05:54,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:59,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:05:59,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:06:00,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:02,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:06:06,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:09,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 23:06:10,931 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.44 vs. limit=10.0 2023-09-29 23:06:14,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:06:15,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=518600.0, ans=0.125 2023-09-29 23:06:16,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:16,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:16,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. 
Duration: 0.57275 2023-09-29 23:06:21,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=518600.0, ans=0.04949747468305833 2023-09-29 23:06:26,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:06:31,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 23:06:37,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:38,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:39,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 23:06:39,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:06:40,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:41,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:42,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:06:46,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:48,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-29 23:06:48,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:06:54,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:06:56,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 23:07:02,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:07:05,196 INFO [train.py:1039] (0/4) Epoch 15, batch 3450, loss[loss=0.181, simple_loss=0.2521, pruned_loss=0.05492, over 23452.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2632, pruned_loss=0.05879, over 4708933.48 frames. ], batch size: 119, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:07:05,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 23:07:08,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=518800.0, ans=0.0 2023-09-29 23:07:10,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 23:07:10,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:13,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:07:13,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-09-29 23:07:13,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:07:16,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:07:21,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:07:22,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:23,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:07:23,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:26,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:33,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 23:07:36,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=518933.3333333333, ans=0.5 2023-09-29 23:07:39,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 23:07:39,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:07:39,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 23:07:40,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:42,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=518933.3333333333, ans=0.125 2023-09-29 23:07:46,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 23:07:48,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:07:53,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:07:53,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:53,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:07:54,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:07:55,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=519000.0, ans=0.125 2023-09-29 23:07:57,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 23:07:57,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:07:59,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 23:08:02,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:04,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 23:08:10,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:08:16,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:08:17,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:19,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:21,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:23,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:08:23,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:08:23,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:08:25,849 INFO [train.py:1039] (0/4) Epoch 15, batch 3500, loss[loss=0.1766, simple_loss=0.2242, pruned_loss=0.06454, over 19652.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2614, pruned_loss=0.05774, over 4708956.92 frames. 
], batch size: 388, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:08:28,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:32,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:08:33,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 23:08:35,737 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.112e+02 2.557e+02 4.010e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:08:35,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:08:36,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=519133.3333333333, ans=0.5 2023-09-29 23:08:39,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:08:41,950 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=519200.0, ans=0.0 2023-09-29 23:08:43,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:43,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 23:08:47,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. 
Duration: 0.863625 2023-09-29 23:08:49,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:50,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:08:50,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:08:50,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:08:52,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:52,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:08:52,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 23:08:54,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:55,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:08:55,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:00,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:01,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. 
Duration: 0.97 2023-09-29 23:09:01,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:09:04,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:06,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:09:06,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=519266.6666666667, ans=0.0 2023-09-29 23:09:07,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:07,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=519266.6666666667, ans=0.0 2023-09-29 23:09:11,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:09:11,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:12,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 23:09:15,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 23:09:15,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 23:09:16,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:09:18,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:18,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:18,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=519333.3333333333, ans=0.0 2023-09-29 23:09:19,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:09:21,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:09:21,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:09:24,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=519333.3333333333, ans=0.05 2023-09-29 23:09:26,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:09:28,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 23:09:28,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 23:09:28,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:09:30,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:09:32,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:34,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:37,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 23:09:38,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:40,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:40,444 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=519400.0, ans=0.0 2023-09-29 23:09:42,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 23:09:43,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 23:09:47,182 INFO [train.py:1039] (0/4) Epoch 15, batch 3550, loss[loss=0.1786, simple_loss=0.2675, pruned_loss=0.04481, over 24440.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2594, pruned_loss=0.0567, over 4694045.41 frames. ], batch size: 69, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:09:47,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:47,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:09:48,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:09:48,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:09:54,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:09:56,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=519466.6666666667, ans=0.0 2023-09-29 23:10:03,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:03,884 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=519533.3333333333, ans=0.125 2023-09-29 23:10:04,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 23:10:06,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=519533.3333333333, ans=0.0 2023-09-29 23:10:07,552 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.34 vs. limit=6.0 2023-09-29 23:10:08,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:09,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 23:10:11,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:12,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:10:12,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:10:14,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=519533.3333333333, ans=0.125 2023-09-29 23:10:15,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:15,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:10:18,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:18,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:10:18,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:10:25,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:10:25,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:27,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. 
Duration: 0.72725 2023-09-29 23:10:27,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:29,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:10:29,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 23:10:29,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:31,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:32,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:10:38,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:38,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:40,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:41,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 23:10:41,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:10:43,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. 
Duration: 0.86 2023-09-29 23:10:44,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:46,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:10:47,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:10:49,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 23:10:51,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:10:51,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=519733.3333333333, ans=0.125 2023-09-29 23:10:55,342 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=519733.3333333333, ans=0.125 2023-09-29 23:11:00,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 23:11:01,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:03,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:11:05,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 23:11:09,416 INFO [train.py:1039] (0/4) Epoch 15, batch 3600, loss[loss=0.1718, simple_loss=0.2469, pruned_loss=0.04833, over 24319.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2591, pruned_loss=0.05645, over 4705655.65 frames. ], batch size: 56, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:11:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 23:11:12,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:11:14,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:11:14,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=519800.0, ans=0.1 2023-09-29 23:11:14,987 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-09-29 23:11:15,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 23:11:20,435 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.241e+02 2.559e+02 3.675e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-29 23:11:20,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:22,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:23,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:11:25,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:11:25,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:25,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 23:11:30,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:11:31,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=519866.6666666667, ans=0.0 2023-09-29 23:11:33,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:35,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 23:11:38,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:38,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:11:40,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:40,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 23:11:40,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:43,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:43,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:11:43,941 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=519933.3333333333, ans=0.125 2023-09-29 23:11:45,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:48,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:49,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:11:51,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. 
Duration: 0.82 2023-09-29 23:11:58,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:00,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:12:00,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 23:12:00,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=520000.0, ans=0.1 2023-09-29 23:12:07,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:12:12,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=520000.0, ans=0.1 2023-09-29 23:12:13,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:16,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:22,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:12:22,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:12:22,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 23:12:24,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. 
Duration: 0.9 2023-09-29 23:12:26,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 23:12:27,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:12:29,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:12:30,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 23:12:30,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:12:30,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:12:30,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:32,131 INFO [train.py:1039] (0/4) Epoch 15, batch 3650, loss[loss=0.1622, simple_loss=0.2336, pruned_loss=0.04535, over 24417.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2591, pruned_loss=0.05649, over 4705414.26 frames. ], batch size: 58, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:12:32,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 23:12:33,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 23:12:37,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:12:39,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 23:12:43,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 23:12:44,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:12:48,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 23:12:50,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 23:12:54,253 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.04 vs. limit=12.0 2023-09-29 23:12:54,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:12:54,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:12:55,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:12:59,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:12:59,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:13:01,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. 
Duration: 0.67 2023-09-29 23:13:01,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:13:02,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:02,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 23:13:04,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:13:05,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:05,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:07,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:13:11,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 23:13:13,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 23:13:14,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:13:16,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 23:13:18,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:13:18,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:13:23,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:13:25,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:26,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:13:26,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=520333.3333333333, ans=0.125 2023-09-29 23:13:28,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:13:28,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:13:28,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=520333.3333333333, ans=0.0 2023-09-29 23:13:31,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:13:34,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:35,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:13:35,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:37,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:13:37,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:39,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:47,177 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 23:13:50,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:50,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:53,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:13:53,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:13:54,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:13:54,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=520466.6666666667, ans=0.125 2023-09-29 23:13:55,843 INFO [train.py:1039] (0/4) Epoch 15, batch 3700, loss[loss=0.1865, simple_loss=0.2668, pruned_loss=0.05313, over 24084.00 frames. 
], tot_loss[loss=0.1871, simple_loss=0.2605, pruned_loss=0.05681, over 4706253.56 frames. ], batch size: 86, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:13:57,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:57,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 23:13:58,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:02,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:14:04,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:14:04,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:14:06,740 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.64 vs. limit=15.0 2023-09-29 23:14:07,295 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.795e+02 1.978e+02 2.320e+02 3.492e+02, threshold=3.956e+02, percent-clipped=0.0 2023-09-29 23:14:07,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:07,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 23:14:07,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:14:07,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=520466.6666666667, ans=0.0 2023-09-29 23:14:09,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:14:09,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:14:10,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:14:13,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:14:15,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:16,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:14:16,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:18,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:14:20,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:23,369 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. 
Duration: 0.896 2023-09-29 23:14:23,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=520533.3333333333, ans=0.125 2023-09-29 23:14:25,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=520533.3333333333, ans=0.125 2023-09-29 23:14:27,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=520600.0, ans=0.5 2023-09-29 23:14:30,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:14:30,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:14:31,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:14:32,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 23:14:32,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:37,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:39,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 23:14:39,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:40,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 23:14:43,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:43,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:14:46,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:14:53,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:53,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 23:14:53,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:53,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 23:14:58,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:14:58,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:14:59,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=520666.6666666667, ans=0.0 2023-09-29 23:15:02,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:04,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. 
Duration: 0.81 2023-09-29 23:15:05,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:15:05,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:15:05,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:05,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:09,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:11,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 23:15:12,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 23:15:13,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:15:13,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:15,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:15:17,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:15:18,591 INFO [train.py:1039] (0/4) Epoch 15, batch 3750, loss[loss=0.1623, simple_loss=0.2414, pruned_loss=0.04162, over 24497.00 frames. 
], tot_loss[loss=0.1886, simple_loss=0.2621, pruned_loss=0.05756, over 4700496.81 frames. ], batch size: 63, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:15:20,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:15:20,515 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=520800.0, ans=0.125 2023-09-29 23:15:21,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:15:23,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:15:25,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 23:15:25,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:15:28,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:15:28,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 23:15:28,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:15:30,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:31,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:15:34,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:15:39,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:43,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:15:43,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:15:46,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:46,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=520866.6666666667, ans=0.0 2023-09-29 23:15:49,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:15:50,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 23:15:50,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:15:51,464 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=15.0 2023-09-29 23:15:52,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:15:53,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:58,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 23:16:00,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 23:16:02,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:16:02,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:16:05,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:10,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:11,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:16:16,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 23:16:19,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:22,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 23:16:22,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=521000.0, ans=0.1 2023-09-29 23:16:23,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:16:25,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:16:29,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:16:30,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:16:31,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=521066.6666666667, ans=0.1 2023-09-29 23:16:33,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:16:35,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:16:36,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:16:38,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=521066.6666666667, ans=0.5 2023-09-29 23:16:42,067 INFO [train.py:1039] (0/4) Epoch 15, batch 3800, loss[loss=0.1947, simple_loss=0.2558, pruned_loss=0.06684, over 23858.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2626, pruned_loss=0.05821, over 4684465.57 frames. 
], batch size: 195, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:16:42,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=521133.3333333333, ans=0.1 2023-09-29 23:16:43,102 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.55 vs. limit=12.0 2023-09-29 23:16:48,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:16:52,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:53,957 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.926e+02 2.197e+02 2.572e+02 3.793e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 23:16:54,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:16:55,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 23:16:57,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:58,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:16:58,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:17:01,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-29 23:17:01,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:02,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=521200.0, ans=0.0 2023-09-29 23:17:03,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:17:05,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:17:05,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:17:05,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:06,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 23:17:07,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=521200.0, ans=0.07 2023-09-29 23:17:09,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 23:17:11,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:17:14,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:16,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-29 23:17:16,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:17:20,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:17:20,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:22,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:23,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:28,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:17:28,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 23:17:32,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:32,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=521333.3333333333, ans=0.0 2023-09-29 23:17:37,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:17:42,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:17:43,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. 
Duration: 0.75 2023-09-29 23:17:45,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 23:17:45,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:48,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:50,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:50,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=521400.0, ans=0.125 2023-09-29 23:17:52,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 23:17:56,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 23:17:56,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 23:17:56,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:58,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:18:02,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:18:03,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.68 vs. 
limit=15.0 2023-09-29 23:18:04,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:18:05,915 INFO [train.py:1039] (0/4) Epoch 15, batch 3850, loss[loss=0.1896, simple_loss=0.2519, pruned_loss=0.0637, over 23710.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2608, pruned_loss=0.05823, over 4673126.69 frames. ], batch size: 164, lr: 6.85e-03, grad_scale: 8.0 2023-09-29 23:18:08,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:18:09,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 23:18:11,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:18:11,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:13,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=521466.6666666667, ans=0.125 2023-09-29 23:18:14,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:18:17,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:20,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:18:22,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. 
Duration: 0.98 2023-09-29 23:18:29,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:32,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:34,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:34,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:18:37,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:38,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:18:40,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:40,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=521600.0, ans=0.0 2023-09-29 23:18:42,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:18:42,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:43,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:18:43,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=521600.0, ans=0.125 2023-09-29 23:18:45,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:45,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:18:46,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 23:18:46,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 23:18:46,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:46,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:50,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:50,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=521600.0, ans=0.0 2023-09-29 23:18:51,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:51,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-09-29 23:18:52,442 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.78 vs. limit=15.0 2023-09-29 23:18:53,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 23:18:56,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:57,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 23:18:58,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:19:04,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:06,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:19:09,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=521666.6666666667, ans=0.0 2023-09-29 23:19:10,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:10,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 23:19:14,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-09-29 23:19:16,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=521733.3333333333, ans=0.125 2023-09-29 23:19:17,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:17,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:22,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:19:22,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:19:22,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:19:23,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 23:19:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:19:25,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. 
Duration: 0.89 2023-09-29 23:19:26,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:27,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:28,418 INFO [train.py:1039] (0/4) Epoch 15, batch 3900, loss[loss=0.177, simple_loss=0.2648, pruned_loss=0.0446, over 24646.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2596, pruned_loss=0.05733, over 4688574.03 frames. ], batch size: 73, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:19:30,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:19:30,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:32,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:19:32,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:32,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:32,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1.whitening_limit, batch_count=521800.0, ans=10.0 2023-09-29 23:19:33,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:33,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. 
Duration: 0.65 2023-09-29 23:19:33,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:38,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:40,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:40,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:19:41,905 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.819e+02 2.036e+02 2.423e+02 3.835e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-29 23:19:42,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:45,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:45,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:47,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:19:49,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 23:19:49,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:19:50,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-09-29 23:19:50,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:50,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 23:19:52,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 23:19:54,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=521866.6666666667, ans=0.2 2023-09-29 23:19:57,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:19:58,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:58,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:20:00,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:05,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:20:06,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:20:09,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:20:09,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:20:10,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:20:18,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:18,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:20:20,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=522000.0, ans=0.125 2023-09-29 23:20:25,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:20:28,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:20:39,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:20:43,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:43,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 23:20:44,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 23:20:44,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:46,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. 
Duration: 0.84 2023-09-29 23:20:47,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:49,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 23:20:51,206 INFO [train.py:1039] (0/4) Epoch 15, batch 3950, loss[loss=0.1888, simple_loss=0.2686, pruned_loss=0.05451, over 23994.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2589, pruned_loss=0.05665, over 4698267.52 frames. ], batch size: 86, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:20:56,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:58,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 23:20:58,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:21:01,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:21:01,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=522133.3333333333, ans=0.0 2023-09-29 23:21:03,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:21:08,263 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 23:21:08,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-29 23:21:10,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 23:21:10,345 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 23:21:10,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:12,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=522200.0, ans=0.2 2023-09-29 23:21:13,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:14,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:21:14,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:16,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 23:21:19,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:21:19,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:19,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:21:21,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 23:21:21,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:21:33,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:21:33,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:21:41,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 23:21:47,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 23:21:47,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 23:21:47,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:21:48,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=522333.3333333333, ans=0.125 2023-09-29 23:21:49,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:21:54,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:21:54,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:21:56,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:21:56,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:21:56,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 23:22:03,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:22:04,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:22:07,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 23:22:12,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=522400.0, ans=10.0 2023-09-29 23:22:15,387 INFO [train.py:1039] (0/4) Epoch 15, batch 4000, loss[loss=0.1895, simple_loss=0.2578, pruned_loss=0.06063, over 23737.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2601, pruned_loss=0.05747, over 4700114.95 frames. ], batch size: 232, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:22:17,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:24,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:22:25,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=522466.6666666667, ans=0.125 2023-09-29 23:22:25,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=522466.6666666667, ans=0.125 2023-09-29 23:22:28,375 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.844e+02 2.082e+02 2.375e+02 3.458e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 23:22:28,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:30,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:22:32,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:32,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 23:22:33,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:22:35,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 23:22:35,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:22:35,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-09-29 23:22:36,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:39,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=522533.3333333333, ans=0.125 2023-09-29 23:22:40,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:22:40,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:22:40,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:22:42,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:22:42,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:22:44,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:22:46,988 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 23:22:47,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=522600.0, ans=0.125 2023-09-29 23:22:48,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:22:48,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:22:51,661 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 23:22:53,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:22:53,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:22:57,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=522600.0, ans=0.2 2023-09-29 23:22:59,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 23:23:01,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:23:03,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:23:04,818 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 23:23:04,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:23:06,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 23:23:06,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:23:06,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:23:08,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:23:11,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:23:11,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:23:11,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:23:13,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 23:23:13,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:15,720 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 23:23:21,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:23:24,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:23:25,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:23:27,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:28,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:23:29,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:23:34,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:37,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:23:37,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 23:23:38,551 INFO [train.py:1039] (0/4) Epoch 15, batch 4050, loss[loss=0.2168, simple_loss=0.27, pruned_loss=0.08177, over 23852.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2604, pruned_loss=0.05731, over 4710701.21 frames. ], batch size: 164, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:23:38,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:23:38,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:23:38,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=522800.0, ans=0.125 2023-09-29 23:23:40,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:23:42,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:23:42,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-29 23:23:42,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=522800.0, ans=0.0 2023-09-29 23:23:44,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=522800.0, ans=0.04949747468305833 2023-09-29 23:23:47,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:50,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:23:50,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:23:54,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:23:55,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:23:59,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=522866.6666666667, ans=15.0 2023-09-29 23:24:00,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:02,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:24:07,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-29 23:24:07,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=522866.6666666667, ans=0.125 2023-09-29 23:24:08,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 23:24:10,076 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 23:24:11,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:24:19,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 23:24:19,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:24,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:27,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:27,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:24:27,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:29,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=523000.0, ans=0.0 2023-09-29 23:24:32,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 23:24:34,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=523000.0, ans=0.125 2023-09-29 23:24:36,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 23:24:36,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:24:38,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:40,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 23:24:40,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=523000.0, ans=0.125 2023-09-29 23:24:44,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:48,036 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=523066.6666666667, ans=0.125 2023-09-29 23:24:50,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 23:24:53,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:53,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:24:55,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-09-29 23:24:55,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 23:24:55,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:24:57,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:24:59,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:24:59,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:25:00,920 INFO [train.py:1039] (0/4) Epoch 15, batch 4100, loss[loss=0.1844, simple_loss=0.2672, pruned_loss=0.05084, over 24471.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2617, pruned_loss=0.05719, over 4711248.49 frames. ], batch size: 66, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:25:06,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 23:25:08,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 23:25:10,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 23:25:12,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 23:25:12,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 23:25:13,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:25:15,164 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 23:25:16,454 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.960e+02 2.243e+02 2.866e+02 4.978e+02, threshold=4.486e+02, percent-clipped=4.0 2023-09-29 23:25:18,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:18,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:25:18,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:19,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:25:22,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:25:24,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:24,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:25:24,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 23:25:24,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:25,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:25:25,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:25,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:25:26,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 23:25:29,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:31,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 23:25:33,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:25:35,738 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.22 vs. limit=15.0 2023-09-29 23:25:36,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:36,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. 
Duration: 0.79 2023-09-29 23:25:38,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:25:40,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:25:40,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:25:41,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 23:25:43,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:25:43,508 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:25:47,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 23:25:47,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=523266.6666666667, ans=0.2 2023-09-29 23:25:48,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:48,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:25:51,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:52,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.88 vs. 
limit=15.0 2023-09-29 23:25:58,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:01,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:02,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:26:05,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=523333.3333333333, ans=0.2 2023-09-29 23:26:13,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:13,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:26:13,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=523400.0, ans=0.0 2023-09-29 23:26:17,577 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.61 vs. limit=15.0 2023-09-29 23:26:18,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:19,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:26:23,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. 
Duration: 0.8 2023-09-29 23:26:25,118 INFO [train.py:1039] (0/4) Epoch 15, batch 4150, loss[loss=0.2119, simple_loss=0.268, pruned_loss=0.07788, over 19933.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2618, pruned_loss=0.05766, over 4714206.00 frames. ], batch size: 388, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:26:26,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:26:26,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:26:26,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:29,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 23:26:29,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:31,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 23:26:31,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 23:26:32,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 23:26:33,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=523466.6666666667, ans=0.125 2023-09-29 23:26:34,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:26:37,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=523466.6666666667, ans=0.1 2023-09-29 23:26:40,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:26:40,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:44,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=523533.3333333333, ans=0.125 2023-09-29 23:26:45,069 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.75 vs. limit=15.0 2023-09-29 23:26:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:26:47,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:26:47,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:26:50,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:26:50,712 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=523533.3333333333, ans=0.2 2023-09-29 23:26:51,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 23:26:53,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:26:57,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:01,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:03,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 23:27:05,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 23:27:05,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:27:07,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 23:27:07,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:27:07,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:08,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:10,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:13,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. 
Duration: 0.99 2023-09-29 23:27:15,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:19,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:27:19,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 23:27:19,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:20,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 23:27:23,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:27:26,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:27,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:29,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 23:27:29,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:27:29,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. 
Duration: 0.463625 2023-09-29 23:27:29,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=523733.3333333333, ans=0.125 2023-09-29 23:27:30,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:27:32,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 23:27:34,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:34,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:27:34,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:27:34,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 23:27:34,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=523733.3333333333, ans=0.1 2023-09-29 23:27:35,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:35,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:27:36,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:27:39,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:39,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 23:27:40,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:45,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=523800.0, ans=0.125 2023-09-29 23:27:46,948 INFO [train.py:1039] (0/4) Epoch 15, batch 4200, loss[loss=0.1849, simple_loss=0.27, pruned_loss=0.0499, over 24274.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2606, pruned_loss=0.05744, over 4713655.50 frames. ], batch size: 74, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:27:47,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:27:49,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 23:27:50,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=523800.0, ans=0.2 2023-09-29 23:27:52,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:27:52,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=523800.0, ans=0.09899494936611666 2023-09-29 23:27:54,347 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.52 vs. 
limit=22.5 2023-09-29 23:27:55,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:27:55,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:27:56,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:56,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:58,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 23:28:01,693 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.915e+02 2.061e+02 2.276e+02 4.406e+02, threshold=4.122e+02, percent-clipped=0.0 2023-09-29 23:28:02,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 23:28:02,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:05,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:08,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:28:09,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:28:11,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 23:28:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:13,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 23:28:13,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:14,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:14,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:28:15,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:28:16,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:28:16,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=523866.6666666667, ans=0.0 2023-09-29 23:28:21,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 23:28:21,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:26,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:28:28,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 23:28:28,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=523933.3333333333, ans=0.125 2023-09-29 23:28:29,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:28:31,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:28:31,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=523933.3333333333, ans=0.125 2023-09-29 23:28:33,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:28:33,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 23:28:33,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:35,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:28:39,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=524000.0, ans=0.1 2023-09-29 23:28:40,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:28:43,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. 
Duration: 0.82725 2023-09-29 23:28:47,073 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=524000.0, ans=0.125 2023-09-29 23:28:49,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:28:52,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 23:28:54,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:58,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=524066.6666666667, ans=0.02 2023-09-29 23:28:59,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:29:01,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:02,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 23:29:06,701 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.67 vs. limit=22.5 2023-09-29 23:29:08,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:29:09,499 INFO [train.py:1039] (0/4) Epoch 15, batch 4250, loss[loss=0.179, simple_loss=0.2433, pruned_loss=0.05741, over 23533.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2587, pruned_loss=0.05692, over 4703585.62 frames. 
], batch size: 285, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:29:12,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:29:12,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:29:12,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=524133.3333333333, ans=0.0 2023-09-29 23:29:15,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:20,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:29:20,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 23:29:22,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:29:24,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:25,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=524200.0, ans=0.125 2023-09-29 23:29:27,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:34,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:29:34,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:34,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:29:34,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:29:36,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:37,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:39,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:42,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:29:44,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:29:45,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 23:29:48,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 23:29:48,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:49,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:29:49,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:50,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:29:50,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:52,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:53,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=524266.6666666667, ans=0.0 2023-09-29 23:29:55,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:29:57,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:29:57,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=524333.3333333334, ans=0.2 2023-09-29 23:30:02,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:04,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:06,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 23:30:06,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-29 23:30:06,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 23:30:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:30:09,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:30:10,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:10,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:30:12,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 23:30:14,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:30:15,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:30:20,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:20,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=524400.0, ans=0.0 2023-09-29 23:30:23,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:25,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-29 23:30:27,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:28,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:30,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:30:30,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:30:30,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 23:30:31,759 INFO [train.py:1039] (0/4) Epoch 15, batch 4300, loss[loss=0.2053, simple_loss=0.2845, pruned_loss=0.06308, over 24544.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2588, pruned_loss=0.05701, over 4709955.18 frames. ], batch size: 71, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:30:32,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:32,738 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.07 vs. limit=22.5 2023-09-29 23:30:36,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:38,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:30:41,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:30:41,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=524466.6666666666, ans=0.1 2023-09-29 23:30:47,065 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.099e+02 2.369e+02 3.970e+02, threshold=4.198e+02, percent-clipped=0.0 2023-09-29 23:30:47,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=524533.3333333334, ans=0.125 2023-09-29 23:30:50,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:50,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 23:30:51,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:30:53,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:30:55,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:30:55,321 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 23:30:58,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:30:59,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:05,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. 
Duration: 0.97 2023-09-29 23:31:05,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:31:05,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 23:31:08,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:31:10,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:31:14,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:31:14,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:31:14,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:31:15,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:16,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:31:16,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 23:31:19,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 23:31:20,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:31:22,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:22,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:31:22,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:23,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:23,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 23:31:23,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 23:31:23,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 23:31:26,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:31:26,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 23:31:27,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 23:31:32,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:33,653 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. 
Duration: 0.032 2023-09-29 23:31:35,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:31:37,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:37,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:39,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 23:31:39,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:39,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:41,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:31:42,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:42,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:31:45,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:31:47,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:48,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:31:49,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:55,415 INFO [train.py:1039] (0/4) Epoch 15, batch 4350, loss[loss=0.2331, simple_loss=0.294, pruned_loss=0.08604, over 19527.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2597, pruned_loss=0.05706, over 4717831.39 frames. ], batch size: 388, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:31:55,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 23:31:55,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:32:03,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:06,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:08,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=524800.0, ans=0.125 2023-09-29 23:32:09,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:32:09,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:32:13,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:32:18,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:32:20,433 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=524866.6666666666, ans=0.0 2023-09-29 23:32:21,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:32:21,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:32:25,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:32:25,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=524866.6666666666, ans=0.125 2023-09-29 23:32:27,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:32:28,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:32:30,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=524933.3333333334, ans=0.125 2023-09-29 23:32:35,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 23:32:36,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:37,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:40,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:32:41,092 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=524933.3333333334, ans=0.125 2023-09-29 23:32:41,574 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.18 vs. limit=6.0 2023-09-29 23:32:44,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 23:32:48,416 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.34 vs. limit=22.5 2023-09-29 23:32:49,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:32:50,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=525000.0, ans=0.125 2023-09-29 23:32:52,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:32:57,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 23:32:59,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:32:59,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:32:59,886 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 23:33:01,334 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-09-29 23:33:01,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:01,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:02,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:33:02,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:04,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:05,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:07,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 23:33:08,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:08,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:08,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:10,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 23:33:11,950 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-09-29 23:33:11,957 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 23:33:11,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 23:33:15,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:33:16,460 INFO [train.py:1039] (0/4) Epoch 15, batch 4400, loss[loss=0.1788, simple_loss=0.2598, pruned_loss=0.04892, over 24504.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2596, pruned_loss=0.05619, over 4736480.94 frames. ], batch size: 63, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:33:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:33:16,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:17,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:33:19,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 23:33:21,185 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 23:33:21,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:27,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 23:33:27,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:29,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:30,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 23:33:30,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 23:33:32,656 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.997e+02 2.183e+02 2.511e+02 3.955e+02, threshold=4.366e+02, percent-clipped=0.0 2023-09-29 23:33:32,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 23:33:32,828 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 23:33:34,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:33:34,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:35,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 23:33:38,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:33:39,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=525200.0, ans=0.125 2023-09-29 23:33:40,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:40,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 23:33:43,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:43,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 23:33:43,794 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 23:33:46,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 23:33:47,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 23:33:47,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 23:33:48,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:48,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:33:50,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:53,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 23:33:53,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 23:33:54,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:56,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:33:56,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:58,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:58,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:58,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 23:34:00,505 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 23:34:02,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:34:04,848 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=525266.6666666666, ans=0.2 2023-09-29 23:34:11,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:34:12,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 23:34:16,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:34:17,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:20,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:34:20,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 23:34:20,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:34:20,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:34:20,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:34:22,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:34:26,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-09-29 23:34:30,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 23:34:32,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 23:34:32,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:34:32,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 23:34:32,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:34:36,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:34:39,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=525400.0, ans=0.0 2023-09-29 23:34:40,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 23:34:41,762 INFO [train.py:1039] (0/4) Epoch 15, batch 4450, loss[loss=0.1651, simple_loss=0.2506, pruned_loss=0.03985, over 24490.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2602, pruned_loss=0.05626, over 4723583.46 frames. ], batch size: 63, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:34:43,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:45,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:34:45,352 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:34:46,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:34:52,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:34:52,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:34:55,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:58,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:35:01,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:35:01,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:02,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 23:35:02,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:03,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:04,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:35:04,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:35:08,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:35:13,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:14,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:15,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=525600.0, ans=0.0 2023-09-29 23:35:16,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:18,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:18,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:35:22,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:35:24,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=525600.0, ans=0.0 2023-09-29 23:35:25,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 23:35:25,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. 
Duration: 0.86 2023-09-29 23:35:25,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:35:28,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:30,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 23:35:34,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:35:38,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:39,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 23:35:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:39,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:39,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:35:39,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:42,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:45,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-09-29 23:35:45,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 23:35:47,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:35:49,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:51,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:53,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:53,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:35:58,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:36:01,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 23:36:02,720 INFO [train.py:1039] (0/4) Epoch 15, batch 4500, loss[loss=0.1744, simple_loss=0.2523, pruned_loss=0.04822, over 24647.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2607, pruned_loss=0.05664, over 4716160.21 frames. ], batch size: 65, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:36:02,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:36:07,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-09-29 23:36:08,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 23:36:08,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 23:36:10,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=525800.0, ans=0.0 2023-09-29 23:36:11,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:16,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:36:16,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:17,840 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.874e+02 2.112e+02 2.381e+02 3.744e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:36:17,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:36:18,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:36:19,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:19,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:30,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 23:36:32,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:36:35,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:36:35,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:36:37,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:36:43,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:36:47,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:36:51,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:36:56,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:36:56,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 23:36:57,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:36:57,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:36:59,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:37:00,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:37:02,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:37:02,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 23:37:02,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:37:02,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:05,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:37:05,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:37:09,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:12,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:37:12,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:37:13,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 23:37:16,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. 
Duration: 0.78 2023-09-29 23:37:16,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 23:37:19,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 23:37:20,966 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=526066.6666666666, ans=0.125 2023-09-29 23:37:23,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 23:37:24,400 INFO [train.py:1039] (0/4) Epoch 15, batch 4550, loss[loss=0.1974, simple_loss=0.2605, pruned_loss=0.06712, over 23717.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2598, pruned_loss=0.05699, over 4713957.83 frames. ], batch size: 212, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:37:24,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:29,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:29,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:31,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:37,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:37:39,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:37:40,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:37:40,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:37:40,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:43,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:43,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:46,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:37:50,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 23:37:50,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 23:37:51,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:37:53,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 23:37:58,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 23:37:59,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:37:59,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=526266.6666666666, ans=0.0 2023-09-29 23:38:04,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 23:38:07,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:38:08,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:08,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:10,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:38:11,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 23:38:15,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:16,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:18,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:18,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:21,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. 
Duration: 0.74 2023-09-29 23:38:21,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 23:38:21,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:38:22,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 23:38:25,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 23:38:25,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:27,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:27,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:29,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:29,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:38:32,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:38:32,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 23:38:34,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:38:34,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:38:36,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 23:38:36,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:38:36,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 23:38:39,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:38:39,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:38:42,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:38:42,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:42,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:38:45,637 INFO [train.py:1039] (0/4) Epoch 15, batch 4600, loss[loss=0.1763, simple_loss=0.2402, pruned_loss=0.05622, over 23697.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2584, pruned_loss=0.05707, over 4704547.34 frames. ], batch size: 232, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:38:45,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:38:47,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:38:50,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:51,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:54,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:38:55,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:38:55,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:38:56,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 23:38:58,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:38:58,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=526466.6666666666, ans=0.0 2023-09-29 23:38:59,850 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.932e+02 2.167e+02 2.436e+02 3.970e+02, threshold=4.334e+02, percent-clipped=0.0 2023-09-29 23:39:02,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. 
Duration: 0.9 2023-09-29 23:39:04,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:06,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:12,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 23:39:14,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:16,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:20,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:39:20,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:23,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 23:39:23,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:39:25,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:39:25,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=526600.0, ans=0.1 2023-09-29 23:39:27,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=526600.0, ans=0.125 2023-09-29 23:39:31,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:34,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:39:34,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:39:38,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 23:39:39,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:39:45,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:45,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:39:48,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:48,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. 
Duration: 0.4555625 2023-09-29 23:39:49,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:49,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 23:39:49,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:51,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:51,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:52,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:52,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:53,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=526733.3333333334, ans=0.0 2023-09-29 23:39:54,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 23:39:54,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 23:39:54,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 23:39:54,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:39:56,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:39:57,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:57,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:40:09,565 INFO [train.py:1039] (0/4) Epoch 15, batch 4650, loss[loss=0.1768, simple_loss=0.2494, pruned_loss=0.05213, over 24335.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2587, pruned_loss=0.05702, over 4699314.78 frames. ], batch size: 61, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:40:11,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:40:12,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:15,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:15,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:40:15,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:40:15,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:16,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:40:19,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 23:40:23,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:40:24,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 23:40:24,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:26,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 23:40:26,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:40:27,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 23:40:27,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 23:40:27,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:29,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:40:32,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:40:33,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:40:33,751 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 23:40:36,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:36,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 23:40:40,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:40,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:40:42,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 23:40:45,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:40:47,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:40:52,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:57,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:00,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-29 23:41:00,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:41:01,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=527000.0, ans=0.2 2023-09-29 23:41:03,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 23:41:03,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=527000.0, ans=0.0 2023-09-29 23:41:04,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 23:41:06,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 23:41:06,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 23:41:07,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:15,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:41:15,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:15,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 23:41:16,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:41:18,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:18,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:41:19,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=527066.6666666666, ans=0.1 2023-09-29 23:41:21,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:41:23,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:41:23,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:23,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=527066.6666666666, ans=0.1 2023-09-29 23:41:25,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:28,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:41:30,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-29 23:41:30,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 23:41:30,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:41:32,028 INFO [train.py:1039] (0/4) Epoch 15, batch 4700, loss[loss=0.2077, simple_loss=0.2762, pruned_loss=0.06961, over 23765.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2599, pruned_loss=0.05709, over 4718614.75 frames. ], batch size: 164, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:41:33,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 23:41:34,638 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=8.73 vs. limit=15.0 2023-09-29 23:41:41,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:42,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:43,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:41:43,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=527133.3333333334, ans=15.0 2023-09-29 23:41:44,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:44,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-29 23:41:48,315 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.872e+02 2.063e+02 2.349e+02 3.516e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-29 23:41:50,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 23:41:51,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 23:41:53,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:55,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:41:55,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:58,811 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.22 vs. limit=10.0 2023-09-29 23:41:59,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:06,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:42:07,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:42:09,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:42:15,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 23:42:15,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:42:18,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:22,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 23:42:23,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:42:28,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:42:30,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 23:42:32,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:32,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:34,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:36,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:42:36,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. 
Duration: 0.67 2023-09-29 23:42:37,679 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 23:42:39,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:40,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 23:42:41,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:41,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=527400.0, ans=0.125 2023-09-29 23:42:44,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 23:42:46,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=527400.0, ans=0.09899494936611666 2023-09-29 23:42:47,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:42:48,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,255 INFO [train.py:1039] (0/4) Epoch 15, batch 4750, loss[loss=0.1971, simple_loss=0.2671, pruned_loss=0.06355, over 23751.00 frames. 
], tot_loss[loss=0.1874, simple_loss=0.2605, pruned_loss=0.05718, over 4710490.97 frames. ], batch size: 179, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:42:53,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:42:55,641 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.24 vs. limit=15.0 2023-09-29 23:42:57,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 23:42:58,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:42:58,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=527466.6666666666, ans=0.125 2023-09-29 23:43:03,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 23:43:06,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:43:07,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:43:08,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:43:10,456 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:43:14,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 23:43:18,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:43:21,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 23:43:22,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:24,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:24,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:25,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:25,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 23:43:25,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 23:43:33,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 23:43:36,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:43:38,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:43:42,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:43:42,117 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 23:43:42,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:43:45,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:43:48,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:43:50,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 23:43:50,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 23:43:50,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:52,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:43:52,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:53,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-29 23:43:53,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 23:43:56,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 23:43:59,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:43:59,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=527733.3333333334, ans=0.125 2023-09-29 23:44:02,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:44:02,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 23:44:02,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:04,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:05,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:44:07,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:07,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:44:12,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:44:12,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 23:44:14,010 INFO [train.py:1039] (0/4) Epoch 15, batch 4800, loss[loss=0.197, simple_loss=0.2735, pruned_loss=0.06029, over 24087.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2609, pruned_loss=0.05734, over 4719821.30 frames. ], batch size: 86, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:44:14,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 23:44:15,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 23:44:18,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:44:19,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:19,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 23:44:27,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:27,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:30,099 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.901e+02 2.248e+02 2.763e+02 5.522e+02, threshold=4.496e+02, percent-clipped=3.0 2023-09-29 23:44:31,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-29 23:44:33,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:33,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:34,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 23:44:34,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:34,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:44:37,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:44:41,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:44:42,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:44:44,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:44:44,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:44:45,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:49,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:53,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:54,025 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=527933.3333333334, ans=0.0 2023-09-29 23:44:55,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:44:56,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:44:58,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:58,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=527933.3333333334, ans=0.125 2023-09-29 23:45:01,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 23:45:01,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 23:45:03,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:45:03,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:45:03,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:45:03,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:03,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:45:06,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:45:06,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:09,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:11,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:12,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:17,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 23:45:17,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:17,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:45:18,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:45:18,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:24,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:24,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:45:24,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:26,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:45:27,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:45:28,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:45:29,410 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.20 vs. limit=15.0 2023-09-29 23:45:32,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:32,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:45:32,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:36,022 INFO [train.py:1039] (0/4) Epoch 15, batch 4850, loss[loss=0.1814, simple_loss=0.2616, pruned_loss=0.05056, over 24308.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2609, pruned_loss=0.05676, over 4721542.74 frames. ], batch size: 74, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:45:36,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 23:45:37,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 23:45:37,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:37,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:40,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:45:40,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:42,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:48,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 23:45:52,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-09-29 23:45:57,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:45:58,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:45:58,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:02,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:46:02,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:46:04,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:46:04,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 23:46:09,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:46:11,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:46:11,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:46:12,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:46:12,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-09-29 23:46:15,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:46:15,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:18,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:19,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 23:46:19,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 23:46:20,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:46:28,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:46:29,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 23:46:31,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:46:31,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:46:31,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=528333.3333333334, ans=0.0 2023-09-29 23:46:32,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. 
Duration: 0.6 2023-09-29 23:46:35,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 23:46:35,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:35,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 23:46:35,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:37,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:46:38,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 23:46:48,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:52,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=528400.0, ans=0.0 2023-09-29 23:46:54,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:46:54,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:46:58,235 INFO [train.py:1039] (0/4) Epoch 15, batch 4900, loss[loss=0.1803, simple_loss=0.2331, pruned_loss=0.06372, over 23337.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2605, pruned_loss=0.05618, over 4729665.24 frames. 
], batch size: 285, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:46:58,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 23:46:58,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:47:05,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:07,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:07,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:47:07,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=528466.6666666666, ans=0.125 2023-09-29 23:47:11,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 23:47:14,801 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.963e+02 2.199e+02 2.506e+02 3.437e+02, threshold=4.398e+02, percent-clipped=0.0 2023-09-29 23:47:16,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 23:47:21,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 23:47:23,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. 
Duration: 0.91 2023-09-29 23:47:23,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:23,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:23,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:47:23,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:23,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:47:24,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 23:47:25,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=528533.3333333334, ans=0.125 2023-09-29 23:47:28,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 23:47:28,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:47:30,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:47:30,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:33,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 23:47:33,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:33,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:33,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 23:47:37,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:47:39,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:39,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 23:47:39,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 23:47:44,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 23:47:47,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:47:47,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:47:47,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:47:47,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:47:47,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:47:48,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=528666.6666666666, ans=0.125 2023-09-29 23:47:49,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:47:49,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 23:47:49,651 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=528666.6666666666, ans=0.0 2023-09-29 23:47:51,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:51,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.89 vs. limit=15.0 2023-09-29 23:47:54,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:47:55,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:48:00,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 23:48:02,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-29 23:48:02,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 23:48:03,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 23:48:10,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:10,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=528733.3333333334, ans=0.125 2023-09-29 23:48:12,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 23:48:14,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:14,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:48:16,641 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.78 vs. limit=15.0 2023-09-29 23:48:17,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:20,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 23:48:20,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:48:20,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:20,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:48:21,902 INFO [train.py:1039] (0/4) Epoch 15, batch 4950, loss[loss=0.1689, simple_loss=0.2521, pruned_loss=0.04285, over 24477.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2593, pruned_loss=0.05575, over 4722730.07 frames. ], batch size: 63, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:48:22,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:48:25,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:25,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:27,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=528800.0, ans=0.015 2023-09-29 23:48:28,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 23:48:28,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 23:48:30,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-09-29 23:48:30,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 23:48:31,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:31,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:48:31,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:48:31,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:32,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=528800.0, ans=0.0 2023-09-29 23:48:35,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:35,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:48:37,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:48:38,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:41,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:41,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 23:48:46,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:48:52,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:53,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:55,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:55,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:56,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:48:57,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 23:48:58,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 23:48:59,240 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.05 vs. limit=15.0 2023-09-29 23:49:01,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:03,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. 
Duration: 0.97775 2023-09-29 23:49:03,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:49:05,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:05,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:49:07,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:49:08,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:11,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:49:14,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:49:14,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:16,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:16,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 23:49:18,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:49:19,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-29 23:49:23,874 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=529000.0, ans=0.125 2023-09-29 23:49:25,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:49:26,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:49:26,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:49:26,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:28,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:49:28,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:49:31,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:49:31,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:49:31,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:32,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-09-29 23:49:34,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:49:41,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 23:49:41,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:49:43,828 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-09-29 23:49:44,492 INFO [train.py:1039] (0/4) Epoch 15, batch 5000, loss[loss=0.167, simple_loss=0.2399, pruned_loss=0.04706, over 24565.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2586, pruned_loss=0.05492, over 4715117.93 frames. ], batch size: 60, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:49:48,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:48,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:49:48,623 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=529133.3333333334, ans=0.0 2023-09-29 23:49:49,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 23:49:51,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 23:49:51,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:49:55,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 23:49:55,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:55,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:49:57,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 23:49:58,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:58,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:49:59,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 23:49:59,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:01,015 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.874e+02 2.133e+02 2.483e+02 3.662e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-29 23:50:01,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:01,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=529200.0, ans=0.125 2023-09-29 23:50:02,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. 
Duration: 0.95 2023-09-29 23:50:02,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 23:50:04,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:50:04,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 23:50:04,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:50:04,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:05,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:50:05,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 23:50:05,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 23:50:07,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 23:50:07,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:07,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:09,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. 
Duration: 0.86 2023-09-29 23:50:09,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:50:13,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:14,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:16,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 23:50:16,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=529266.6666666666, ans=0.125 2023-09-29 23:50:17,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 23:50:17,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:50:20,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=529266.6666666666, ans=0.0 2023-09-29 23:50:20,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=529266.6666666666, ans=0.125 2023-09-29 23:50:21,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:50:26,017 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 23:50:29,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-29 23:50:31,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:31,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:50:33,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 23:50:33,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:33,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:34,627 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.17 vs. limit=15.0 2023-09-29 23:50:34,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:50:36,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:50:38,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:38,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=529333.3333333334, ans=0.1 2023-09-29 23:50:42,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-29 23:50:42,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:50:42,691 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:50:43,160 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=12.0 2023-09-29 23:50:47,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 23:50:51,667 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=529400.0, ans=0.125 2023-09-29 23:50:54,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:02,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:03,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:03,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:51:03,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:03,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:51:03,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. 
Duration: 0.62725 2023-09-29 23:51:04,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=529400.0, ans=0.0 2023-09-29 23:51:05,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:07,339 INFO [train.py:1039] (0/4) Epoch 15, batch 5050, loss[loss=0.2145, simple_loss=0.2732, pruned_loss=0.07788, over 22840.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2591, pruned_loss=0.0558, over 4712630.12 frames. ], batch size: 322, lr: 6.80e-03, grad_scale: 8.0 2023-09-29 23:51:11,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:11,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 23:51:12,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:51:16,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:17,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:51:17,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 23:51:19,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:19,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:51:22,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:51:24,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:51:24,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:51:34,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 23:51:34,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:51:34,868 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=529533.3333333334, ans=0.0 2023-09-29 23:51:36,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:36,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 23:51:36,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:51:37,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:39,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:39,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-29 23:51:39,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 23:51:40,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 23:51:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:44,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:51:47,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:47,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 23:51:50,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:51:52,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 23:51:56,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:51:56,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:51:56,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:56,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. 
Duration: 0.836375 2023-09-29 23:51:59,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:00,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:52:02,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:02,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:52:02,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:52:02,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 23:52:04,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:52:06,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:52:10,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:52:10,860 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 23:52:10,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:52:11,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:52:12,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:12,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 23:52:14,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:14,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 23:52:14,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:19,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 23:52:20,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=529733.3333333334, ans=0.2 2023-09-29 23:52:21,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 23:52:24,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:24,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:52:24,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:52:28,531 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 23:52:29,667 INFO [train.py:1039] (0/4) Epoch 15, batch 5100, loss[loss=0.2019, simple_loss=0.2647, pruned_loss=0.06951, over 22859.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.261, pruned_loss=0.05697, over 4704241.39 frames. ], batch size: 322, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:52:32,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:35,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 23:52:35,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 23:52:38,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:39,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:41,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:42,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 23:52:42,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. 
Duration: 0.76 2023-09-29 23:52:44,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=529866.6666666666, ans=0.125 2023-09-29 23:52:47,403 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.887e+02 2.076e+02 2.309e+02 3.546e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-29 23:52:47,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:47,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:52:49,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=529866.6666666666, ans=0.2 2023-09-29 23:52:52,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:54,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 23:52:55,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:57,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:57,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:52:57,763 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:53:00,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:53:00,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:02,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 23:53:06,360 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 23:53:06,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=529933.3333333334, ans=0.0 2023-09-29 23:53:07,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:07,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 23:53:08,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 23:53:13,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:53:15,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=529933.3333333334, ans=0.0 2023-09-29 23:53:23,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:53:26,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 23:53:27,494 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 23:53:27,509 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. 
Duration: 0.896 2023-09-29 23:53:29,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 23:53:29,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:32,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 23:53:35,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 23:53:37,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:53:39,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:53:40,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 23:53:42,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:53:42,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 23:53:48,123 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=12.0 2023-09-29 23:53:49,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:53:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. 
Duration: 0.936375 2023-09-29 23:53:49,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:53:50,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:53:50,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:53:52,344 INFO [train.py:1039] (0/4) Epoch 15, batch 5150, loss[loss=0.1632, simple_loss=0.2452, pruned_loss=0.04057, over 24490.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2612, pruned_loss=0.0567, over 4714217.63 frames. ], batch size: 63, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:53:52,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:53:53,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 23:53:53,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 23:53:55,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 23:53:55,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:53:55,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 23:53:58,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:54:00,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 23:54:01,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:03,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:06,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:54:06,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 23:54:08,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:09,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:54:11,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:54:11,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:11,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:12,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:54:12,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-29 23:54:13,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 23:54:14,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:54:16,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:18,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:54:19,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 23:54:20,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:54:20,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=530200.0, ans=0.125 2023-09-29 23:54:27,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:54:27,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 23:54:31,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:37,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:37,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:54:42,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=530333.3333333334, ans=0.0 2023-09-29 23:54:43,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:43,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:45,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 23:54:50,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:51,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:54:51,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:55,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:56,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:57,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=530400.0, ans=0.07 2023-09-29 23:54:58,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 23:55:05,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:55:07,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:55:10,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:55:10,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:55:10,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=530400.0, ans=0.0 2023-09-29 23:55:11,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:55:11,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:55:11,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:55:11,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=530400.0, ans=0.2 2023-09-29 23:55:13,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:55:14,506 INFO [train.py:1039] (0/4) Epoch 15, batch 5200, loss[loss=0.186, simple_loss=0.2608, pruned_loss=0.05563, over 24675.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.263, pruned_loss=0.05788, over 4702463.06 frames. ], batch size: 65, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:55:16,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. 
Duration: 0.92225 2023-09-29 23:55:17,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:55:19,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:24,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 23:55:26,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:55:27,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:31,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:31,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:55:32,511 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.846e+02 2.066e+02 2.366e+02 4.637e+02, threshold=4.132e+02, percent-clipped=1.0 2023-09-29 23:55:32,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:34,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 23:55:36,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:55:36,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-29 23:55:38,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 23:55:42,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:55:44,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:55:44,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 23:55:44,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 23:55:47,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 23:55:49,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:49,494 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 23:55:49,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:51,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:51,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:55:52,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. 
Duration: 0.93 2023-09-29 23:55:53,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:55:55,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:59,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 23:55:59,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 23:55:59,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 23:56:05,308 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.28 vs. limit=10.0 2023-09-29 23:56:07,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 23:56:09,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:56:14,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:56:14,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:15,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 23:56:15,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:56:16,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:56:16,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:17,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:56:21,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:21,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=530733.3333333334, ans=0.05 2023-09-29 23:56:22,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:56:23,221 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.11 vs. limit=12.0 2023-09-29 23:56:25,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:56:27,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:30,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-29 23:56:32,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 23:56:33,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:33,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:56:35,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:36,897 INFO [train.py:1039] (0/4) Epoch 15, batch 5250, loss[loss=0.1962, simple_loss=0.2394, pruned_loss=0.0765, over 19360.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.262, pruned_loss=0.05758, over 4703454.60 frames. ], batch size: 390, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:56:36,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:56:37,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:56:42,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:56:42,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=530800.0, ans=0.1 2023-09-29 23:56:44,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-29 23:56:44,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=530800.0, ans=0.125 2023-09-29 23:56:45,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:56:47,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:56:52,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:55,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:56:57,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:56:57,838 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.27 vs. limit=15.0 2023-09-29 23:56:58,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:57:01,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 23:57:01,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:57:03,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-29 23:57:10,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=530933.3333333334, ans=0.0 2023-09-29 23:57:10,978 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:57:13,609 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=530933.3333333334, ans=0.125 2023-09-29 23:57:19,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=530933.3333333334, ans=0.0 2023-09-29 23:57:22,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=531000.0, ans=0.2 2023-09-29 23:57:26,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=531000.0, ans=0.125 2023-09-29 23:57:35,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=531000.0, ans=0.0 2023-09-29 23:57:52,127 INFO [train.py:1039] (0/4) Epoch 15, batch 5300, loss[loss=0.1791, simple_loss=0.2381, pruned_loss=0.06001, over 23415.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2597, pruned_loss=0.05704, over 4697940.83 frames. ], batch size: 285, lr: 6.78e-03, grad_scale: 16.0 2023-09-29 23:57:57,007 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.61 vs. 
limit=15.0 2023-09-29 23:57:59,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=531133.3333333334, ans=0.125 2023-09-29 23:58:00,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=531133.3333333334, ans=0.1 2023-09-29 23:58:06,814 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.161e+02 2.637e+02 4.366e+02, threshold=4.323e+02, percent-clipped=1.0 2023-09-29 23:58:06,869 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-15.pt 2023-09-29 23:58:12,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:58:12,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 23:58:12,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 23:58:12,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:12,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:13,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:13,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:13,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-29 23:58:13,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:13,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:13,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:58:14,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:58:14,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 23:58:14,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 23:58:14,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 23:58:14,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:58:14,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 23:58:14,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 23:58:14,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:15,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. 
Duration: 0.97275 2023-09-29 23:58:15,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:15,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:15,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:58:16,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:16,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:16,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:16,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:16,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:16,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:58:16,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:16,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:58:17,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. 
Duration: 0.97 2023-09-29 23:58:17,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:18,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:18,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 23:58:18,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 23:58:18,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:58:18,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:18,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 23:58:19,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 23:58:19,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:19,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:58:19,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:20,098 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. 
Duration: 0.956 2023-09-29 23:58:20,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 23:58:20,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:58:20,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:20,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 23:58:21,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 23:58:21,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 23:58:21,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:24,968 INFO [train.py:1039] (0/4) Epoch 16, batch 0, loss[loss=0.1949, simple_loss=0.262, pruned_loss=0.06389, over 23851.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.262, pruned_loss=0.06389, over 23851.00 frames. ], batch size: 195, lr: 6.57e-03, grad_scale: 32.0 2023-09-29 23:58:24,969 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-29 23:58:41,234 INFO [train.py:1071] (0/4) Epoch 16, validation: loss=0.3148, simple_loss=0.2815, pruned_loss=0.174, over 1125622.00 frames. 2023-09-29 23:58:41,234 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-29 23:58:41,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-09-29 23:58:41,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=531213.3333333334, ans=0.1 2023-09-29 23:58:42,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:58:44,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:58:46,745 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.07 vs. limit=15.0 2023-09-29 23:58:50,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=531213.3333333334, ans=0.125 2023-09-29 23:58:51,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:53,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:58:53,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:54,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 23:58:57,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 23:58:57,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-09-29 23:58:59,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:59:02,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:59:02,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:02,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:59:04,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:04,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 23:59:08,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:15,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:59:15,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:17,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 23:59:18,560 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.40 vs. limit=15.0 2023-09-29 23:59:21,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-09-29 23:59:21,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:59:23,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:27,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:59:32,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:38,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 23:59:41,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 23:59:42,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=531413.3333333334, ans=0.1 2023-09-29 23:59:44,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:59:44,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:44,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:59:45,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:47,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. 
Duration: 0.85 2023-09-29 23:59:50,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:50,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:55,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:59:59,027 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 23:59:59,353 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=531480.0, ans=0.0 2023-09-30 00:00:01,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:00:03,411 INFO [train.py:1039] (0/4) Epoch 16, batch 50, loss[loss=0.1817, simple_loss=0.2575, pruned_loss=0.05293, over 22035.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2606, pruned_loss=0.05857, over 1058007.95 frames. ], batch size: 48, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:00:03,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:06,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:06,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. 
Duration: 0.62 2023-09-30 00:00:07,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=531546.6666666666, ans=0.0 2023-09-30 00:00:08,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:00:09,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:00:09,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=531546.6666666666, ans=0.125 2023-09-30 00:00:11,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:14,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:15,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:19,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 00:00:19,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:25,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:00:27,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 00:00:29,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. 
Duration: 0.93 2023-09-30 00:00:31,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:00:34,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:00:34,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:34,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:00:36,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:00:37,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:00:37,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:44,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:00:47,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:00:47,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:00:48,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 00:00:50,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 00:00:50,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=531746.6666666666, ans=0.2 2023-09-30 00:00:51,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:00:51,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 00:00:53,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:55,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 00:01:03,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:03,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:01:04,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:08,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:08,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 00:01:13,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. 
Duration: 0.56 2023-09-30 00:01:14,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:14,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:17,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:01:17,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:01:19,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 00:01:19,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 00:01:22,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 00:01:23,582 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.948e+02 2.203e+02 2.562e+02 3.872e+02, threshold=4.407e+02, percent-clipped=0.0 2023-09-30 00:01:23,625 INFO [train.py:1039] (0/4) Epoch 16, batch 100, loss[loss=0.2156, simple_loss=0.2932, pruned_loss=0.06905, over 23591.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2625, pruned_loss=0.05735, over 1884093.33 frames. ], batch size: 85, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:01:23,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:23,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. 
Duration: 0.611125 2023-09-30 00:01:25,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 00:01:25,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 00:01:25,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:27,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:28,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:01:28,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:01:33,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:01:35,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:01:38,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:38,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 00:01:38,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:43,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 00:01:43,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:43,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:43,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:45,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:45,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 00:01:49,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:01:49,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:49,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:49,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:50,452 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.04 vs. limit=15.0 2023-09-30 00:01:53,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 00:01:54,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 00:01:56,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:57,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:01:59,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:02:02,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 00:02:02,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 00:02:03,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:03,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:02:08,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:02:10,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:02:12,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:18,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:18,323 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-09-30 00:02:22,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:02:25,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:02:27,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:02:28,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:33,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:34,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:36,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:02:40,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:40,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:41,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:41,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:02:41,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:02:43,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 00:02:43,346 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 00:02:43,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:43,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:02:43,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:43,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:43,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 00:02:45,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:02:45,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:02:45,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:46,471 INFO [train.py:1039] (0/4) Epoch 16, batch 150, loss[loss=0.1983, simple_loss=0.2603, pruned_loss=0.0681, over 23851.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2623, pruned_loss=0.05752, over 2514943.15 frames. 
], batch size: 179, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:02:47,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:48,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:48,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:02:49,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:02:50,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=532213.3333333334, ans=0.0 2023-09-30 00:02:51,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:54,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:54,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:02:56,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:00,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:00,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 00:03:03,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:03:04,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:07,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 00:03:07,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 00:03:07,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 00:03:09,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=532280.0, ans=0.125 2023-09-30 00:03:10,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:03:10,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:03:10,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:03:12,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:03:12,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:13,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 00:03:13,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:14,676 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 00:03:17,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:17,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532346.6666666666, ans=0.1 2023-09-30 00:03:18,224 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.36 vs. limit=22.5 2023-09-30 00:03:22,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:24,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=532346.6666666666, ans=0.125 2023-09-30 00:03:26,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:03:28,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 00:03:31,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:03:31,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:31,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 00:03:33,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:03:36,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:37,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:03:38,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:38,707 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.88 vs. limit=15.0 2023-09-30 00:03:39,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 00:03:42,143 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.22 vs. limit=22.5 2023-09-30 00:03:44,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:46,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:03:46,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:03:46,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. 
Duration: 0.611125 2023-09-30 00:03:49,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:52,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 00:03:55,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:03:58,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:04:00,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:03,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:04:04,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 00:04:05,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:04:05,035 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 00:04:08,509 INFO [train.py:1039] (0/4) Epoch 16, batch 200, loss[loss=0.1562, simple_loss=0.2313, pruned_loss=0.04053, over 24444.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2619, pruned_loss=0.05704, over 3008593.44 frames. 
], batch size: 58, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:04:08,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=532546.6666666666, ans=0.125 2023-09-30 00:04:10,022 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.410e+02 1.995e+02 2.387e+02 2.784e+02 4.621e+02, threshold=4.773e+02, percent-clipped=1.0 2023-09-30 00:04:10,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:12,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:04:13,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:04:17,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 00:04:18,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:18,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:22,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 00:04:23,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:04:23,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 00:04:25,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:28,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:04:28,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:30,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:49,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:04:49,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:04:50,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:04:52,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:04:52,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:04:52,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:04:55,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:57,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 00:04:59,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:59,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:00,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 00:05:02,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:05:02,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:06,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:05:12,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:05:15,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=532813.3333333334, ans=0.0 2023-09-30 00:05:18,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:19,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:05:27,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:30,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. 
Duration: 0.59 2023-09-30 00:05:32,400 INFO [train.py:1039] (0/4) Epoch 16, batch 250, loss[loss=0.1896, simple_loss=0.2616, pruned_loss=0.05874, over 23293.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2622, pruned_loss=0.05742, over 3380771.35 frames. ], batch size: 93, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:05:32,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:32,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:05:32,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:33,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:05:34,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 00:05:34,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:05:34,296 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 00:05:36,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=532880.0, ans=0.0 2023-09-30 00:05:37,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:38,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 00:05:40,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:41,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:44,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:05:45,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:46,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:05:47,529 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.00 vs. limit=15.0 2023-09-30 00:05:50,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:06:03,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:05,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:05,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 00:06:11,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=533013.3333333334, ans=0.125 2023-09-30 00:06:12,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:06:12,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:06:13,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:06:15,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:15,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:06:15,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:06:16,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:19,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:06:22,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 00:06:22,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=533080.0, ans=0.2 2023-09-30 00:06:23,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:06:25,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:06:25,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:06:25,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:06:25,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:27,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:06:27,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:06:30,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:31,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:06:32,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:38,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 00:06:38,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=533146.6666666666, ans=0.125 2023-09-30 00:06:40,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:44,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:06:47,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=533146.6666666666, ans=0.0 2023-09-30 00:06:48,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:50,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:06:54,760 INFO [train.py:1039] (0/4) Epoch 16, batch 300, loss[loss=0.1837, simple_loss=0.2673, pruned_loss=0.05004, over 24044.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2601, pruned_loss=0.05693, over 3663129.06 frames. ], batch size: 80, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:06:54,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 00:06:55,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:55,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 00:06:56,944 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.129e+02 2.398e+02 3.317e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-30 00:06:57,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 00:06:57,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=533213.3333333334, ans=0.1 2023-09-30 00:06:58,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:07:00,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:07:00,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 00:07:05,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:07:07,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:10,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:07:10,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 00:07:12,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:07:13,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 00:07:13,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 00:07:15,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:18,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:07:23,960 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-80000.pt 2023-09-30 00:07:27,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:07:27,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 00:07:30,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 00:07:30,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:30,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=533346.6666666666, ans=0.0 2023-09-30 00:07:33,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:35,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:35,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. 
Duration: 0.66 2023-09-30 00:07:35,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:07:37,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:07:39,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:07:41,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:07:47,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:07:47,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 00:07:48,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:07:52,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:53,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 00:07:54,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:59,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:02,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 00:08:02,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 00:08:06,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:06,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:08:08,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:11,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:08:11,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 00:08:11,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:08:13,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:15,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 00:08:17,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:18,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:08:18,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=533480.0, ans=0.125 2023-09-30 00:08:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:20,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:20,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:22,262 INFO [train.py:1039] (0/4) Epoch 16, batch 350, loss[loss=0.1836, simple_loss=0.2594, pruned_loss=0.05391, over 24676.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2586, pruned_loss=0.05621, over 3893444.79 frames. ], batch size: 65, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:08:26,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:26,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 00:08:28,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:34,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:37,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:39,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:08:40,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 00:08:42,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:42,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 00:08:46,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:46,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 00:08:48,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:51,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 00:08:53,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:08:55,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:57,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:58,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:08:58,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 00:09:00,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:00,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:01,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:09:03,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:03,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:09,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:09,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:09:10,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:09:10,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:12,995 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=22.5 2023-09-30 00:09:17,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. 
Duration: 0.91 2023-09-30 00:09:17,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:22,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:22,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:22,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:09:24,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 00:09:26,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:27,600 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 00:09:27,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 00:09:27,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:31,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:31,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 00:09:34,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:09:37,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:09:39,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:40,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:40,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:42,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:45,707 INFO [train.py:1039] (0/4) Epoch 16, batch 400, loss[loss=0.1966, simple_loss=0.2789, pruned_loss=0.05718, over 24026.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2578, pruned_loss=0.056, over 4078518.55 frames. ], batch size: 80, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:09:45,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:47,270 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.835e+02 1.997e+02 2.324e+02 4.354e+02, threshold=3.993e+02, percent-clipped=1.0 2023-09-30 00:09:47,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:09:48,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 00:09:48,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:09:49,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:51,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:09:52,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:56,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:57,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:59,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 00:10:01,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 00:10:01,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:03,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 00:10:03,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 00:10:04,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=533946.6666666666, ans=0.125 2023-09-30 00:10:07,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=533946.6666666666, ans=0.125 2023-09-30 00:10:08,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:10:08,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:08,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 00:10:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:10:10,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:10,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:11,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:10:13,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 00:10:14,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 00:10:19,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:10:20,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:22,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 00:10:23,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 00:10:27,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:10:31,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:38,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 00:10:44,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:10:44,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 00:10:45,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:47,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:10:47,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 00:10:52,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 00:10:55,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:10:56,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:58,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:58,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 00:11:00,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:11:01,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 00:11:01,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=534146.6666666666, ans=0.0 2023-09-30 00:11:03,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:11:05,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:11:06,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 00:11:08,080 INFO [train.py:1039] (0/4) Epoch 16, batch 450, loss[loss=0.1641, simple_loss=0.2519, pruned_loss=0.03817, over 24448.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2587, pruned_loss=0.05619, over 4219627.66 frames. 
], batch size: 66, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:11:09,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:11:09,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:11:09,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:11:12,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 00:11:12,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:11:14,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:11:14,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:11:15,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 00:11:15,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:11:16,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 00:11:17,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=534213.3333333334, ans=0.2 2023-09-30 00:11:19,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:11:26,509 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=534280.0, ans=0.125 2023-09-30 00:11:29,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:29,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:11:30,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 00:11:30,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 00:11:36,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:11:37,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:40,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:43,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:11:46,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 00:11:47,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 00:11:49,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 00:11:49,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:11:51,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:52,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:11:54,516 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 00:11:54,529 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 00:11:54,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:56,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:11:57,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:12:00,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. 
Duration: 0.57275 2023-09-30 00:12:00,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:12:00,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:12:02,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 00:12:05,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:07,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:12:07,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:12:09,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 00:12:14,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:12:15,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 00:12:15,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 00:12:17,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:24,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 00:12:26,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:27,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:12:27,685 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 00:12:30,620 INFO [train.py:1039] (0/4) Epoch 16, batch 500, loss[loss=0.1657, simple_loss=0.2364, pruned_loss=0.04749, over 24337.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2602, pruned_loss=0.05713, over 4323354.02 frames. ], batch size: 56, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:12:32,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:33,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:12:35,022 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.824e+02 2.052e+02 2.354e+02 3.367e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 00:12:35,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:35,176 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 00:12:35,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=534546.6666666666, ans=0.125 2023-09-30 00:12:36,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-09-30 00:12:36,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:39,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:12:44,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 00:12:44,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:12:48,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:49,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:03,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:03,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:13:03,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:13:03,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:04,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. 
Duration: 0.92 2023-09-30 00:13:04,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:13:08,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:13:08,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:13:08,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:13:08,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:09,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 00:13:12,823 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 00:13:14,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:14,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 00:13:20,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 00:13:24,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:13:25,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:31,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:35,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:41,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:44,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 00:13:44,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:44,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:47,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 00:13:47,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:13:48,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:13:51,952 INFO [train.py:1039] (0/4) Epoch 16, batch 550, loss[loss=0.1759, simple_loss=0.2505, pruned_loss=0.05064, over 23444.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2603, pruned_loss=0.057, over 4424275.36 frames. ], batch size: 105, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:13:55,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 00:13:57,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 00:13:57,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:57,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 00:13:59,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:13:59,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:59,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:14:01,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 00:14:04,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:14:05,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 00:14:06,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:14:07,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=534946.6666666666, ans=0.0 2023-09-30 00:14:11,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:11,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:14,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:15,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:17,740 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.77 vs. limit=15.0 2023-09-30 00:14:20,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 00:14:20,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 00:14:23,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 00:14:28,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:14:30,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:31,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:14:32,759 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.64 vs. limit=15.0 2023-09-30 00:14:34,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:34,964 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 00:14:35,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:36,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:14:39,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:39,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:14:39,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:14:40,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:14:42,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 00:14:43,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 00:14:45,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:14:45,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:45,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:14:45,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:14:48,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:14:51,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:14:54,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:14:55,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:55,585 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.65 vs. limit=22.5 2023-09-30 00:14:56,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-30 00:14:58,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:15:00,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:02,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:15:03,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:05,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:15:05,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:15:12,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 00:15:14,317 INFO [train.py:1039] (0/4) Epoch 16, batch 600, loss[loss=0.1615, simple_loss=0.241, pruned_loss=0.04097, over 22038.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.261, pruned_loss=0.05643, over 4503390.82 frames. ], batch size: 48, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:15:15,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 00:15:16,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:15:18,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 00:15:18,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:19,621 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.903e+02 2.137e+02 2.465e+02 5.407e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 00:15:26,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:15:28,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:15:29,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 00:15:31,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:15:34,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:15:37,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:38,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 00:15:40,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:15:40,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=535280.0, ans=0.1 2023-09-30 00:15:43,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-09-30 00:15:49,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:15:49,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:15:51,874 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=8.51 vs. limit=12.0 2023-09-30 00:15:56,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:15:56,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:15:57,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:03,810 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.52 vs. limit=6.0 2023-09-30 00:16:04,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:16:08,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:08,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 00:16:08,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:16:16,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 00:16:21,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:16:21,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:16:26,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 00:16:26,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:16:29,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 00:16:29,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:16:29,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:16:37,771 INFO [train.py:1039] (0/4) Epoch 16, batch 650, loss[loss=0.1553, simple_loss=0.2096, pruned_loss=0.05045, over 22650.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2603, pruned_loss=0.05695, over 4532231.28 frames. ], batch size: 322, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:16:37,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-30 00:16:41,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:16:43,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:16:44,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:16:46,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:16:46,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=535546.6666666666, ans=0.0 2023-09-30 00:16:49,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 00:16:49,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:54,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=535613.3333333334, ans=0.1 2023-09-30 00:16:55,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:16:55,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 00:16:56,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=535613.3333333334, ans=0.125 2023-09-30 00:16:58,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:02,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 00:17:04,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:06,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:06,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=535613.3333333334, ans=0.125 2023-09-30 00:17:06,916 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=535613.3333333334, ans=0.0 2023-09-30 00:17:09,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:09,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:17:12,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:14,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:17:16,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:17:16,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:18,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:17:20,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:17:20,124 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 00:17:20,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:20,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:24,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:26,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:26,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:27,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:17:29,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. 
Duration: 0.87 2023-09-30 00:17:29,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:17:29,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:17:29,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=535746.6666666666, ans=0.125 2023-09-30 00:17:31,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:17:31,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:32,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:17:34,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 00:17:34,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 00:17:34,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:34,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:36,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:17:36,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 00:17:39,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:47,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:47,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:49,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:53,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:53,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:17:54,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:58,906 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.29 vs. limit=15.0 2023-09-30 00:17:59,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:17:59,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:17:59,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 00:17:59,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:01,185 INFO [train.py:1039] (0/4) Epoch 16, batch 700, loss[loss=0.1985, simple_loss=0.2666, pruned_loss=0.06524, over 23607.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2587, pruned_loss=0.05658, over 4551448.47 frames. ], batch size: 135, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:18:05,524 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.452e+02 1.867e+02 2.174e+02 2.485e+02 3.899e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-30 00:18:05,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 00:18:07,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 00:18:09,802 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=535880.0, ans=0.1 2023-09-30 00:18:10,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 00:18:11,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:12,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:18:14,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 00:18:19,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 00:18:22,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:18:24,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=535946.6666666666, ans=0.125 2023-09-30 00:18:25,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:25,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:18:27,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:18:29,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:31,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:18:31,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:18:32,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 00:18:36,809 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.96 vs. limit=15.0 2023-09-30 00:18:37,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 00:18:41,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 00:18:41,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:18:44,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:18:46,074 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=536013.3333333334, ans=0.125 2023-09-30 00:18:49,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:18:49,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 00:18:54,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:56,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:18:56,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 00:19:01,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:19:01,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:04,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 00:19:09,843 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.28 vs. limit=22.5 2023-09-30 00:19:10,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:19:10,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 00:19:12,758 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=22.5 2023-09-30 00:19:15,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 00:19:15,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 00:19:18,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:20,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:21,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:24,044 INFO [train.py:1039] (0/4) Epoch 16, batch 750, loss[loss=0.1747, simple_loss=0.2533, pruned_loss=0.04807, over 24483.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2574, pruned_loss=0.05606, over 4583974.10 frames. ], batch size: 63, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:19:24,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 00:19:24,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 00:19:28,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 00:19:28,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 00:19:28,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 00:19:31,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 00:19:31,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 00:19:31,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:19:32,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 00:19:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:34,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:19:36,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:39,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:19:39,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:19:39,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:41,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:19:42,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:19:42,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=536280.0, ans=0.125 2023-09-30 00:19:44,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=536280.0, ans=0.125 2023-09-30 00:19:45,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:19:48,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:48,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:48,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 00:19:50,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 00:19:52,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:53,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:55,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:19:56,309 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.10 vs. limit=15.0 2023-09-30 00:19:56,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 00:19:57,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:59,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 00:19:59,293 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 00:19:59,531 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:20:00,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 00:20:00,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:20:02,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 00:20:05,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:20:07,979 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.23 vs. limit=12.0 2023-09-30 00:20:11,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:20:12,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:12,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:20:13,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=536413.3333333334, ans=0.0 2023-09-30 00:20:13,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=536413.3333333334, ans=0.125 2023-09-30 00:20:14,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:20:17,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:17,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 00:20:17,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 00:20:19,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 00:20:19,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=536413.3333333334, ans=0.125 2023-09-30 00:20:20,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:20:23,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:20:23,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 00:20:25,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:30,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:20:32,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:20:32,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:20:38,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 00:20:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:20:40,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:41,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:43,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:44,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:46,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:20:47,964 INFO [train.py:1039] (0/4) Epoch 16, batch 800, loss[loss=0.2031, simple_loss=0.2701, pruned_loss=0.06803, over 23770.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2584, pruned_loss=0.05595, over 4616638.78 frames. ], batch size: 212, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:20:52,613 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.946e+02 2.133e+02 2.496e+02 4.467e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 00:20:54,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:54,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:55,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:55,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 00:20:57,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:59,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:59,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:03,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=536613.3333333334, ans=0.2 2023-09-30 00:21:04,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:05,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:21:09,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 00:21:10,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:13,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:21:13,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:21:13,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:13,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. 
Duration: 0.85 2023-09-30 00:21:14,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:15,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 00:21:19,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:22,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:25,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:21:26,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:28,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:28,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:30,653 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.89 vs. limit=15.0 2023-09-30 00:21:32,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:21:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 00:21:34,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 00:21:37,336 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 00:21:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 00:21:37,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:21:37,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:21:39,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:39,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:21:45,314 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.34 vs. limit=15.0 2023-09-30 00:21:46,403 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 00:21:46,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 00:21:48,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:21:48,708 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.90 vs. 
limit=15.0 2023-09-30 00:21:51,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:21:53,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:21:55,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=536813.3333333334, ans=0.125 2023-09-30 00:21:58,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:59,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 00:21:59,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:22:02,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 00:22:07,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=536880.0, ans=0.0 2023-09-30 00:22:08,779 INFO [train.py:1039] (0/4) Epoch 16, batch 850, loss[loss=0.1824, simple_loss=0.2702, pruned_loss=0.04731, over 24435.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2593, pruned_loss=0.05589, over 4654960.52 frames. ], batch size: 69, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:22:10,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:11,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:22:13,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 00:22:13,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:22:15,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:17,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 00:22:17,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:17,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=536880.0, ans=0.125 2023-09-30 00:22:18,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:22:19,404 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.88 vs. limit=12.0 2023-09-30 00:22:20,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:21,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:22:21,714 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.82 vs. 
limit=15.0 2023-09-30 00:22:24,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:22:25,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 00:22:25,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 00:22:25,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 00:22:27,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:28,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:22:30,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:30,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:30,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:22:35,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:35,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:22:35,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. 
Duration: 0.87 2023-09-30 00:22:38,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 00:22:43,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:44,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 00:22:46,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 00:22:50,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 00:22:53,503 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 00:22:53,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:22:53,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:22:53,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:22:57,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. 
Duration: 0.59 2023-09-30 00:23:00,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:23:02,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:03,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:23:03,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:23:05,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:23:06,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:23:07,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 00:23:12,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:23:12,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:13,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:23:13,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:14,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:23:16,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=537146.6666666666, ans=0.125 2023-09-30 00:23:17,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:23:19,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:23:21,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:23:22,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:22,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:23:29,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:23:31,197 INFO [train.py:1039] (0/4) Epoch 16, batch 900, loss[loss=0.1945, simple_loss=0.2592, pruned_loss=0.06492, over 23559.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2594, pruned_loss=0.05563, over 4683635.83 frames. ], batch size: 134, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:23:31,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:31,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 00:23:31,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 00:23:32,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:33,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 00:23:36,670 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.050e+02 2.390e+02 2.977e+02 4.145e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-30 00:23:38,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:23:41,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:42,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 00:23:46,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:23:46,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 00:23:48,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:23:49,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:49,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:23:49,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 00:23:49,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:23:53,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=537280.0, ans=0.125 2023-09-30 00:24:02,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:02,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:24:02,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:24:05,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:11,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 00:24:14,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:24:18,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:24:18,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:24:20,100 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 00:24:21,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-09-30 00:24:27,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:24:27,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:24:28,805 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.14 vs. limit=22.5 2023-09-30 00:24:30,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:24:36,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:36,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:24:39,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 00:24:39,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:41,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 00:24:43,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:24:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:45,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 00:24:45,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:24:50,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 00:24:50,621 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 00:24:52,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:24:52,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 00:24:54,974 INFO [train.py:1039] (0/4) Epoch 16, batch 950, loss[loss=0.1732, simple_loss=0.2472, pruned_loss=0.0496, over 21349.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2602, pruned_loss=0.05565, over 4691492.34 frames. ], batch size: 46, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:24:55,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:59,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 00:24:59,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=537546.6666666666, ans=0.0 2023-09-30 00:25:01,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=537546.6666666666, ans=0.0 2023-09-30 00:25:04,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:25:08,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:08,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:09,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:25:12,914 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 00:25:16,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:16,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:17,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:17,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:25:17,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 00:25:19,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:25:21,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:21,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. 
Duration: 0.97 2023-09-30 00:25:21,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:27,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:25:29,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 00:25:30,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:25:32,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:34,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=537680.0, ans=0.125 2023-09-30 00:25:35,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:25:35,440 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=537680.0, ans=0.0 2023-09-30 00:25:40,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:25:40,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:25:45,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 00:25:46,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:25:46,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:25:48,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:48,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:48,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:25:53,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 00:25:54,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:25:55,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=537746.6666666666, ans=0.125 2023-09-30 00:25:58,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:59,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:59,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. 
Duration: 0.8 2023-09-30 00:25:59,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:59,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:25:59,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 00:26:04,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:26:07,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:26:07,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=537813.3333333334, ans=0.125 2023-09-30 00:26:10,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:13,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 00:26:13,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 00:26:15,999 INFO [train.py:1039] (0/4) Epoch 16, batch 1000, loss[loss=0.1948, simple_loss=0.2593, pruned_loss=0.06516, over 23790.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2589, pruned_loss=0.05575, over 4687789.10 frames. ], batch size: 179, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:26:16,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:26:20,689 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 2.021e+02 2.226e+02 2.513e+02 3.322e+02, threshold=4.453e+02, percent-clipped=0.0 2023-09-30 00:26:20,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 00:26:20,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:24,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:26:26,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 00:26:26,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 00:26:31,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:31,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:33,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:34,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 00:26:37,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=537946.6666666666, ans=0.1 2023-09-30 00:26:40,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. 
Duration: 0.76 2023-09-30 00:26:40,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=537946.6666666666, ans=0.2 2023-09-30 00:26:42,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 00:26:43,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:26:46,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 00:26:46,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 00:26:46,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 00:26:49,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:50,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:57,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:59,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:27:01,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:01,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:27:01,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 00:27:01,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:02,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:27:02,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:27:04,462 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 00:27:06,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 00:27:07,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 00:27:09,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 00:27:10,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:27:15,420 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.94 vs. limit=15.0 2023-09-30 00:27:17,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:17,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 00:27:19,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:20,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:27:21,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 00:27:23,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:27:23,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 00:27:24,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 00:27:26,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:27:26,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:27,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:27:31,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:27:33,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:27:34,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=538146.6666666666, ans=0.2 2023-09-30 00:27:36,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:27:36,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=538146.6666666666, ans=0.0 2023-09-30 00:27:37,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:27:39,062 INFO [train.py:1039] (0/4) Epoch 16, batch 1050, loss[loss=0.1975, simple_loss=0.2734, pruned_loss=0.06077, over 24393.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2573, pruned_loss=0.05552, over 4675267.47 frames. ], batch size: 77, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:27:39,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:27:40,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:43,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:27:47,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:27:48,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:27:52,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:27:53,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:27:53,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:27:55,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:27:55,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 00:27:57,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:27:58,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 00:28:00,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:28:01,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 00:28:01,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:28:05,613 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=538280.0, ans=0.0 2023-09-30 00:28:07,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:28:09,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 00:28:09,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:28:09,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=538280.0, ans=0.2 2023-09-30 00:28:09,917 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.07 vs. limit=15.0 2023-09-30 00:28:12,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 00:28:12,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 00:28:12,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:28:15,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 00:28:19,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 00:28:19,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:22,499 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.01 vs. limit=22.5 2023-09-30 00:28:23,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:28:27,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-09-30 00:28:27,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:28:28,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:28:29,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:31,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:28:36,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 00:28:37,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 00:28:38,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 00:28:38,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:39,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:28:40,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. 
Duration: 0.88 2023-09-30 00:28:40,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=538413.3333333334, ans=0.0 2023-09-30 00:28:40,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:42,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=538413.3333333334, ans=0.125 2023-09-30 00:28:44,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:28:47,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:47,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:28:49,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:49,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:49,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=538480.0, ans=0.125 2023-09-30 00:28:54,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:54,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. 
Duration: 0.98 2023-09-30 00:28:55,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:55,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 00:28:57,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 00:28:57,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:28:57,540 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:29:00,851 INFO [train.py:1039] (0/4) Epoch 16, batch 1100, loss[loss=0.1841, simple_loss=0.2718, pruned_loss=0.0482, over 24538.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2578, pruned_loss=0.0553, over 4687327.78 frames. ], batch size: 71, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:29:01,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:03,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=538546.6666666666, ans=0.0 2023-09-30 00:29:06,282 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.914e+02 2.113e+02 2.523e+02 4.579e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-30 00:29:07,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:29:14,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 00:29:15,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:29:15,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:17,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 00:29:19,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:29:21,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:29:24,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:29:25,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:29:25,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 00:29:26,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=538613.3333333334, ans=10.0 2023-09-30 00:29:27,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:29:27,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:29,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 00:29:31,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:29:33,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=538680.0, ans=0.125 2023-09-30 00:29:34,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:29:39,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:29:44,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 00:29:45,781 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 00:29:45,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:47,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:49,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:29:49,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:51,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 00:29:52,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 00:29:52,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:29:52,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:29:52,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:54,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 00:29:59,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:29:59,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 00:29:59,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=538746.6666666666, ans=0.125 2023-09-30 00:30:02,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:30:08,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:30:12,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 00:30:12,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:30:14,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:30:15,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:15,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:18,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 00:30:19,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:30:19,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:21,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 00:30:21,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:30:21,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=538813.3333333334, ans=0.1 2023-09-30 00:30:22,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 00:30:24,581 INFO [train.py:1039] (0/4) Epoch 16, batch 1150, loss[loss=0.1485, simple_loss=0.2241, pruned_loss=0.03649, over 24287.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.258, pruned_loss=0.0552, over 4688345.08 frames. ], batch size: 56, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:30:24,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:30:24,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:30:26,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:30:31,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:34,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:30:36,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:36,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:30:36,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 00:30:36,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:30:39,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 00:30:39,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:39,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:30:46,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-09-30 00:30:46,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=538946.6666666666, ans=0.2 2023-09-30 00:30:47,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:53,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:54,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:30:54,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 00:30:56,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:30:56,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:31:01,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 00:31:01,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:02,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:31:04,128 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.21 vs. 
limit=15.0 2023-09-30 00:31:11,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=539013.3333333334, ans=0.0 2023-09-30 00:31:13,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:19,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:19,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 00:31:20,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:21,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:27,798 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 00:31:29,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:37,325 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 00:31:41,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:42,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:31:42,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 00:31:43,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=539146.6666666666, ans=0.125 2023-09-30 00:31:44,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:31:47,047 INFO [train.py:1039] (0/4) Epoch 16, batch 1200, loss[loss=0.1636, simple_loss=0.2444, pruned_loss=0.04137, over 24474.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2589, pruned_loss=0.05519, over 4703665.16 frames. ], batch size: 63, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:31:48,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:31:53,128 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.828e+02 2.089e+02 2.357e+02 3.548e+02, threshold=4.177e+02, percent-clipped=0.0 2023-09-30 00:31:55,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:31:55,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:31:55,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=539213.3333333334, ans=0.2 2023-09-30 00:31:57,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:57,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:58,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 00:32:00,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:32:02,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:32:03,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:03,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:07,025 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 00:32:10,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 00:32:13,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:32:16,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:32:19,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:22,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:32:22,087 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 00:32:22,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:32:27,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=539346.6666666666, ans=0.0 2023-09-30 00:32:28,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:32:29,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:32:29,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 00:32:29,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:32:30,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=539346.6666666666, ans=0.125 2023-09-30 00:32:34,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 00:32:38,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 00:32:40,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:41,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:43,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=539413.3333333334, ans=0.125 2023-09-30 00:32:44,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 00:32:44,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:32:46,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:46,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:32:47,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:32:48,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 00:32:48,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:32:48,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:32:48,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:32:50,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:50,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:52,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=539480.0, ans=0.125 2023-09-30 00:32:55,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. 
Duration: 0.52225 2023-09-30 00:32:55,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=539480.0, ans=0.2 2023-09-30 00:32:57,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:32:59,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=539480.0, ans=0.125 2023-09-30 00:33:01,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 00:33:04,936 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 00:33:09,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:10,643 INFO [train.py:1039] (0/4) Epoch 16, batch 1250, loss[loss=0.1893, simple_loss=0.2647, pruned_loss=0.05692, over 23459.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2606, pruned_loss=0.05636, over 4703115.18 frames. ], batch size: 93, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:33:12,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:33:13,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:33:15,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:33:17,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. 
Duration: 0.67 2023-09-30 00:33:22,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:33:22,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:22,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 00:33:23,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:33:25,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:33:30,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:33:30,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:32,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:33:32,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:35,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:33:36,872 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=539613.3333333334, ans=0.125 2023-09-30 00:33:38,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-30 00:33:38,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:33:38,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:41,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:43,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:44,670 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.90 vs. limit=12.0 2023-09-30 00:33:46,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:33:48,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:33:52,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 00:33:53,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:33:55,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:33:56,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 00:33:58,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 00:33:58,239 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 00:33:58,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:58,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:01,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:01,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=539746.6666666666, ans=0.1 2023-09-30 00:34:05,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=539746.6666666666, ans=0.125 2023-09-30 00:34:06,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:06,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:34:08,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 00:34:08,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 00:34:08,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 00:34:11,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:34:11,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=539746.6666666666, ans=0.125 2023-09-30 00:34:13,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 00:34:13,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:13,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=539746.6666666666, ans=0.1 2023-09-30 00:34:15,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:34:15,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:34:18,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 00:34:18,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:34:18,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:34:19,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:34:20,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:34:23,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. 
Duration: 0.85 2023-09-30 00:34:27,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:28,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:34:29,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:34:31,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:34:33,157 INFO [train.py:1039] (0/4) Epoch 16, batch 1300, loss[loss=0.168, simple_loss=0.2529, pruned_loss=0.04152, over 24673.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2605, pruned_loss=0.05556, over 4726477.53 frames. ], batch size: 65, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:34:36,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:36,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 00:34:39,922 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.902e+02 2.089e+02 2.370e+02 3.462e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 00:34:41,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:43,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:34:43,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 00:34:46,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:48,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:34:48,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 00:34:53,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:34:53,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:34:55,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 00:34:57,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=539946.6666666666, ans=0.0 2023-09-30 00:35:00,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:35:01,180 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.04 vs. limit=15.0 2023-09-30 00:35:03,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:05,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:35:06,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:35:08,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:08,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:35:08,981 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.79 vs. limit=15.0 2023-09-30 00:35:09,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:35:09,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 00:35:16,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:35:17,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:35:19,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 00:35:19,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:35:21,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:35:25,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 00:35:26,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 00:35:26,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:26,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 00:35:26,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=540080.0, ans=0.125 2023-09-30 00:35:28,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:33,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:33,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:35:33,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=540080.0, ans=10.0 2023-09-30 00:35:36,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 00:35:36,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=540080.0, ans=0.1 2023-09-30 00:35:38,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 00:35:39,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. 
Duration: 0.66 2023-09-30 00:35:42,164 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.40 vs. limit=15.0 2023-09-30 00:35:42,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:35:45,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 00:35:47,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:49,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=540146.6666666666, ans=0.125 2023-09-30 00:35:54,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 00:35:56,306 INFO [train.py:1039] (0/4) Epoch 16, batch 1350, loss[loss=0.1631, simple_loss=0.2377, pruned_loss=0.04424, over 24576.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2589, pruned_loss=0.05506, over 4735489.43 frames. ], batch size: 60, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:35:59,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:02,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:06,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:36:07,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 00:36:09,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:36:09,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:09,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=540213.3333333334, ans=0.125 2023-09-30 00:36:12,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:13,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 00:36:15,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:16,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:36:18,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 00:36:19,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:36:21,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:36:21,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. 
Duration: 0.98 2023-09-30 00:36:23,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=540280.0, ans=0.125 2023-09-30 00:36:24,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 00:36:28,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 00:36:29,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:29,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 00:36:38,195 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=540346.6666666666, ans=0.0 2023-09-30 00:36:42,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:44,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=540413.3333333334, ans=0.0 2023-09-30 00:36:52,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:52,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:52,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 00:36:55,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 00:36:55,901 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.06 vs. limit=12.0 2023-09-30 00:36:58,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 00:36:58,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:58,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:37:02,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:37:04,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 00:37:07,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:37:13,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 00:37:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 00:37:17,854 INFO [train.py:1039] (0/4) Epoch 16, batch 1400, loss[loss=0.1661, simple_loss=0.2438, pruned_loss=0.04421, over 24465.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2572, pruned_loss=0.05512, over 4708788.38 frames. ], batch size: 63, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:37:19,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. 
Duration: 0.87 2023-09-30 00:37:21,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=540546.6666666666, ans=0.1 2023-09-30 00:37:22,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:37:23,985 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.842e+02 1.998e+02 2.370e+02 3.291e+02, threshold=3.996e+02, percent-clipped=0.0 2023-09-30 00:37:24,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:37:24,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:37:32,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 00:37:33,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 00:37:44,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:37:45,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:37:49,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:37:49,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. 
Duration: 0.77775 2023-09-30 00:37:54,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:37:55,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:38:03,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:04,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:09,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 00:38:10,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:38:12,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:38:12,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:38:13,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:15,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:38:15,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:38:15,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 00:38:15,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=540746.6666666666, ans=0.125 2023-09-30 00:38:15,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=540746.6666666666, ans=0.0 2023-09-30 00:38:16,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 00:38:16,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:38:22,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:25,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:38:33,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 00:38:33,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:38:34,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:38:36,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:38:38,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:39,965 INFO [train.py:1039] (0/4) Epoch 16, batch 1450, loss[loss=0.1985, simple_loss=0.2594, pruned_loss=0.06874, over 22737.00 frames. 
], tot_loss[loss=0.1841, simple_loss=0.2571, pruned_loss=0.05555, over 4708403.67 frames. ], batch size: 322, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:38:40,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:38:43,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:38:45,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:38:45,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:45,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:38:50,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:50,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=540880.0, ans=0.125 2023-09-30 00:38:51,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:38:53,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:54,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 00:38:55,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-30 00:38:57,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 00:38:57,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:58,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:38:58,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 00:39:00,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:00,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:39:01,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 00:39:01,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:03,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:39:04,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:07,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:10,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 00:39:10,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:39:13,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:39:13,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:15,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:15,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:39:16,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:16,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:20,415 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-09-30 00:39:21,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 00:39:24,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:28,523 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 00:39:30,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 00:39:30,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:39:31,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:33,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 00:39:37,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:39,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 00:39:40,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 00:39:42,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:43,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:39:45,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:48,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 00:39:48,957 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.42 vs. limit=10.0 2023-09-30 00:39:51,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. 
Duration: 0.98 2023-09-30 00:39:51,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 00:39:52,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:55,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:40:00,190 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.30 vs. limit=15.0 2023-09-30 00:40:02,490 INFO [train.py:1039] (0/4) Epoch 16, batch 1500, loss[loss=0.1934, simple_loss=0.2777, pruned_loss=0.05457, over 24380.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2582, pruned_loss=0.05618, over 4697166.68 frames. ], batch size: 77, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:40:06,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 00:40:07,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:40:07,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:40:09,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:10,587 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.398e+02 1.908e+02 2.053e+02 2.386e+02 4.299e+02, threshold=4.105e+02, percent-clipped=2.0 2023-09-30 00:40:10,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 00:40:10,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:40:12,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 00:40:13,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:40:13,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:40:13,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:15,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:40:16,417 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.75 vs. limit=15.0 2023-09-30 00:40:18,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:40:19,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:19,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=541280.0, ans=0.025 2023-09-30 00:40:23,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:40:23,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 00:40:25,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:40:25,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:40:26,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:29,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 00:40:35,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 00:40:37,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:39,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 00:40:40,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:40:43,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:40:44,515 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.56 vs. limit=10.0 2023-09-30 00:40:45,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:40:45,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:40:46,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 00:40:47,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:40:47,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:48,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 00:40:48,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:55,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:40:55,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 00:41:01,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:41:03,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:41:08,055 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 00:41:08,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 00:41:08,150 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 00:41:10,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:11,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:13,785 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 00:41:15,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:41:18,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 00:41:19,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:22,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:22,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:23,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:24,361 INFO [train.py:1039] (0/4) Epoch 16, batch 1550, loss[loss=0.2433, simple_loss=0.3003, pruned_loss=0.09321, over 19542.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2588, pruned_loss=0.05606, over 4700588.44 frames. 
], batch size: 388, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:41:24,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:24,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:41:26,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 00:41:26,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 00:41:26,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:41:27,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 00:41:27,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 00:41:31,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:32,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:34,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:41:34,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:41:35,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 00:41:35,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:39,566 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 00:41:39,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:41,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:41:41,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:41:44,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:41:44,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 00:41:46,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:46,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 00:41:46,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=541613.3333333334, ans=0.0 2023-09-30 00:41:47,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 00:41:47,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. 
Duration: 0.98 2023-09-30 00:41:49,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:49,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:41:54,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:55,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 00:41:55,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 00:41:56,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=541680.0, ans=0.0 2023-09-30 00:42:06,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:10,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:42:10,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:42:10,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:42:12,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 00:42:17,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 00:42:18,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:22,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:42:25,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:42:25,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:25,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 00:42:27,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:28,110 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.79 vs. limit=15.0 2023-09-30 00:42:28,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:42:28,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:30,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:42:30,357 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 00:42:33,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 00:42:38,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 00:42:44,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:44,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=541880.0, ans=0.05 2023-09-30 00:42:45,151 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.68 vs. limit=15.0 2023-09-30 00:42:45,944 INFO [train.py:1039] (0/4) Epoch 16, batch 1600, loss[loss=0.1889, simple_loss=0.2534, pruned_loss=0.06215, over 23824.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2593, pruned_loss=0.05592, over 4711744.16 frames. ], batch size: 164, lr: 6.50e-03, grad_scale: 16.0 2023-09-30 00:42:46,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:46,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 00:42:46,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:48,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:48,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:42:48,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 00:42:49,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:42:53,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:54,298 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.801e+02 1.974e+02 2.195e+02 3.172e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 00:42:54,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 00:42:56,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 00:42:59,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 00:43:01,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=541946.6666666666, ans=0.125 2023-09-30 00:43:02,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:04,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 00:43:04,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:43:07,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:43:11,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 00:43:13,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 00:43:13,979 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.28 vs. limit=22.5 2023-09-30 00:43:16,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:43:17,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 00:43:17,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:19,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 00:43:25,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 00:43:33,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:33,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 00:43:34,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:35,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:35,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 00:43:38,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 00:43:41,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 00:43:42,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:43:44,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:43:47,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:43:49,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:43:50,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:43:57,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:59,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:44:02,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. 
Duration: 0.89 2023-09-30 00:44:02,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:44:03,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 00:44:06,898 INFO [train.py:1039] (0/4) Epoch 16, batch 1650, loss[loss=0.178, simple_loss=0.2604, pruned_loss=0.04782, over 24570.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2601, pruned_loss=0.05637, over 4707444.56 frames. ], batch size: 71, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:44:10,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:11,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:11,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:44:11,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 00:44:11,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 00:44:11,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 00:44:11,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 00:44:14,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 00:44:15,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:16,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:44:16,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:44:17,262 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.18 vs. limit=15.0 2023-09-30 00:44:19,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:21,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 00:44:22,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:44:24,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:24,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:44:24,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:44:27,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 00:44:27,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-09-30 00:44:32,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:44:35,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:44:39,528 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.78 vs. limit=10.0 2023-09-30 00:44:42,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 00:44:42,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:45,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 00:44:47,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:44:50,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:44:51,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:44:51,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:44:52,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:52,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:44:56,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:56,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:58,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:44:58,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:00,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:01,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:45:05,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:45:07,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 00:45:07,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:08,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 00:45:11,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 00:45:11,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. 
Duration: 0.7 2023-09-30 00:45:11,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:12,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:45:12,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:12,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:45:12,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 00:45:17,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:18,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:45:18,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:19,095 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:45:21,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 00:45:26,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:26,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 00:45:26,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 00:45:28,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:28,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:45:28,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:30,571 INFO [train.py:1039] (0/4) Epoch 16, batch 1700, loss[loss=0.1947, simple_loss=0.2769, pruned_loss=0.05625, over 24558.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2591, pruned_loss=0.05601, over 4715571.89 frames. ], batch size: 71, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:45:32,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:45:33,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:45:33,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 00:45:37,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:45:40,401 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.927e+02 2.213e+02 2.603e+02 4.204e+02, threshold=4.426e+02, percent-clipped=1.0 2023-09-30 00:45:45,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:45:48,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:45:52,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=542613.3333333334, ans=0.1 2023-09-30 00:45:53,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:45:53,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:45:55,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:45:55,694 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.72 vs. limit=22.5 2023-09-30 00:45:58,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 00:46:01,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:46:01,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:02,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 00:46:03,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=542680.0, ans=0.2 2023-09-30 00:46:04,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:46:06,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 00:46:08,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 00:46:10,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:12,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 00:46:13,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:46:20,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:22,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:22,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:46:22,599 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=542746.6666666666, ans=0.0 2023-09-30 00:46:23,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 00:46:23,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 00:46:24,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:46:27,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:27,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 00:46:27,324 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:46:27,589 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.60 vs. limit=10.0 2023-09-30 00:46:28,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:46:28,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:28,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:28,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:46:28,614 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=542746.6666666666, ans=0.0 2023-09-30 00:46:31,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=542746.6666666666, ans=0.1 2023-09-30 00:46:32,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:32,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:46:33,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:35,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:46:35,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:40,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:42,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 00:46:45,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:45,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:47,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. 
Duration: 0.57 2023-09-30 00:46:53,071 INFO [train.py:1039] (0/4) Epoch 16, batch 1750, loss[loss=0.1926, simple_loss=0.2494, pruned_loss=0.06787, over 22713.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2576, pruned_loss=0.05576, over 4702622.81 frames. ], batch size: 322, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:46:53,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:58,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:58,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:46:58,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 00:46:58,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:47:00,345 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=542880.0, ans=0.125 2023-09-30 00:47:01,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:47:01,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:06,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 00:47:08,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:47:10,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 00:47:10,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:11,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:47:14,123 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=542946.6666666666, ans=0.0 2023-09-30 00:47:15,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:47:16,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 00:47:18,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:47:18,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 00:47:28,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:47:32,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:47:32,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:35,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 00:47:35,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:38,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:47:38,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:43,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:43,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:44,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 00:47:47,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:50,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 00:47:50,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:47:51,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:53,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:47:57,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 00:47:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:47:57,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:59,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=543146.6666666666, ans=0.0 2023-09-30 00:48:00,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:48:03,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:06,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:08,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:48:08,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 00:48:08,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:10,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:48:10,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:10,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 00:48:10,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:48:11,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:48:14,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:48:16,446 INFO [train.py:1039] (0/4) Epoch 16, batch 1800, loss[loss=0.2071, simple_loss=0.2744, pruned_loss=0.06995, over 23322.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2575, pruned_loss=0.05544, over 4714475.84 frames. ], batch size: 119, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:48:17,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:48:19,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:48:21,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:25,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=543213.3333333334, ans=0.125 2023-09-30 00:48:26,085 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.882e+02 2.134e+02 2.523e+02 4.257e+02, threshold=4.267e+02, percent-clipped=0.0 2023-09-30 00:48:26,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 00:48:26,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:48:31,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:34,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:34,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:36,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:48:38,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=543280.0, ans=0.125 2023-09-30 00:48:39,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:39,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 00:48:41,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:44,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:47,968 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.30 vs. limit=15.0 2023-09-30 00:48:48,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 00:48:50,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-09-30 00:48:50,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 00:48:50,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:52,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:52,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:52,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:48:54,684 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=22.5 2023-09-30 00:49:00,854 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 00:49:03,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:49:05,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:07,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=543413.3333333334, ans=0.025 2023-09-30 00:49:08,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 00:49:08,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. 
Duration: 0.86 2023-09-30 00:49:08,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:49:09,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:49:11,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:49:11,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.92 vs. limit=15.0 2023-09-30 00:49:16,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 00:49:22,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=543480.0, ans=0.125 2023-09-30 00:49:23,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:49:24,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 00:49:24,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:49:24,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:24,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:49:26,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. 
Duration: 0.44 2023-09-30 00:49:29,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:49:29,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:49:34,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 00:49:34,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:37,839 INFO [train.py:1039] (0/4) Epoch 16, batch 1850, loss[loss=0.1828, simple_loss=0.265, pruned_loss=0.05024, over 24619.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2585, pruned_loss=0.05551, over 4737021.84 frames. ], batch size: 68, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:49:37,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:37,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:49:37,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:49:42,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:49:42,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:46,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:49:48,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:49:55,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:49:55,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 00:49:59,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 00:50:02,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 00:50:06,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:06,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 00:50:06,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:50:18,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:50:20,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. 
Duration: 0.98 2023-09-30 00:50:24,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:50:24,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:28,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 00:50:29,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:29,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:50:29,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:50:30,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:50:33,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:50:37,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:50:37,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:38,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:50:38,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:50:40,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:42,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:50:45,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 00:50:45,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:49,926 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.16 vs. limit=15.0 2023-09-30 00:50:50,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:50:50,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:50:50,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 00:50:50,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 00:50:52,943 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 00:50:54,922 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 00:50:56,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 00:50:56,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:56,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:50:58,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:58,823 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.54 vs. limit=22.5 2023-09-30 00:50:59,480 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 00:50:59,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:50:59,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:59,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=543880.0, ans=0.1 2023-09-30 00:51:00,953 INFO [train.py:1039] (0/4) Epoch 16, batch 1900, loss[loss=0.2001, simple_loss=0.2659, pruned_loss=0.06709, over 22855.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2588, pruned_loss=0.05588, over 4728636.12 frames. ], batch size: 322, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:51:01,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 00:51:01,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=543880.0, ans=0.0 2023-09-30 00:51:02,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:51:02,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:51:02,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 00:51:05,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:05,815 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 00:51:05,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:51:07,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:09,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=543880.0, ans=0.2 2023-09-30 00:51:10,290 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.936e+02 2.154e+02 2.566e+02 3.893e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-30 00:51:11,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:15,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 00:51:16,982 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 00:51:17,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 00:51:19,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:51:20,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:51:20,048 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 00:51:22,072 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 00:51:25,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 00:51:27,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:51:31,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 00:51:34,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 00:51:45,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 00:51:46,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 00:51:46,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:51:48,327 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 00:51:48,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 00:51:48,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 00:51:49,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 00:51:49,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:51:54,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 00:51:58,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:52:00,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:00,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 00:52:02,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:52:04,143 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=544080.0, ans=0.125 2023-09-30 00:52:05,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. 
Duration: 0.82 2023-09-30 00:52:06,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:12,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:52:12,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:52:12,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:52:12,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:52:13,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:52:13,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 00:52:15,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:52:18,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:18,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:21,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:52:21,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 00:52:21,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:23,318 INFO [train.py:1039] (0/4) Epoch 16, batch 1950, loss[loss=0.1744, simple_loss=0.26, pruned_loss=0.0444, over 24650.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2598, pruned_loss=0.05613, over 4723343.09 frames. ], batch size: 68, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:52:23,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:26,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:30,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:52:30,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:30,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:52:31,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 00:52:33,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:52:33,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:35,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:52:37,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:52:37,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:37,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:40,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:52:45,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:45,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:52:45,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:52:45,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:49,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:53,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:53,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:54,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. 
Duration: 0.52725 2023-09-30 00:52:54,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 00:52:54,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:52:55,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:52:55,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:56,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=544346.6666666666, ans=0.125 2023-09-30 00:52:59,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:02,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:53:09,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:53:14,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:53:15,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:53:15,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 00:53:16,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 00:53:19,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:53:21,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:53:22,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:24,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=544413.3333333334, ans=0.0 2023-09-30 00:53:29,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:30,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:32,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:32,667 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=544480.0, ans=0.125 2023-09-30 00:53:34,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:53:37,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:53:37,979 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.56 vs. limit=10.0 2023-09-30 00:53:39,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 00:53:39,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:53:41,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:41,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 00:53:44,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:53:46,223 INFO [train.py:1039] (0/4) Epoch 16, batch 2000, loss[loss=0.1873, simple_loss=0.2504, pruned_loss=0.06211, over 23771.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2609, pruned_loss=0.05613, over 4718030.50 frames. ], batch size: 212, lr: 6.49e-03, grad_scale: 16.0 2023-09-30 00:53:47,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:49,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:53:50,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:51,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 00:53:53,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=544546.6666666666, ans=0.2 2023-09-30 00:53:54,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:55,957 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.874e+02 2.052e+02 2.476e+02 4.888e+02, threshold=4.104e+02, percent-clipped=2.0 2023-09-30 00:53:57,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 00:53:57,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:54:00,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:54:03,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 00:54:03,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:54:05,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:54:08,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:54:10,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. 
Duration: 0.93 2023-09-30 00:54:10,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=544613.3333333334, ans=0.125 2023-09-30 00:54:12,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:16,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 00:54:16,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:54:17,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 00:54:17,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:20,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:54:22,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:54:22,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:22,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 00:54:24,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:24,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 00:54:26,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 00:54:26,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:26,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:33,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:35,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:54:35,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:36,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:54:39,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:41,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:41,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 00:54:41,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:42,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:46,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:46,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 00:54:46,837 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=544746.6666666666, ans=0.125 2023-09-30 00:54:52,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:54:52,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:55:01,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=544813.3333333334, ans=0.0 2023-09-30 00:55:02,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:02,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 00:55:02,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:03,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:55:03,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:55:06,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:07,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:08,442 INFO [train.py:1039] (0/4) Epoch 16, batch 2050, loss[loss=0.1907, simple_loss=0.2607, pruned_loss=0.06041, over 23624.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2598, pruned_loss=0.05623, over 4714052.55 frames. ], batch size: 149, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:55:10,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:11,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:15,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=544880.0, ans=0.0 2023-09-30 00:55:18,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:55:21,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 00:55:23,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:23,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:55:24,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 00:55:24,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:55:25,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=544946.6666666666, ans=0.0 2023-09-30 00:55:26,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:55:26,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:55:29,488 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.40 vs. limit=15.0 2023-09-30 00:55:35,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=544946.6666666666, ans=0.125 2023-09-30 00:55:38,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:38,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 00:55:39,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 00:55:42,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:44,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 00:55:44,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:47,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:49,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:55:51,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:55:52,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:54,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:55:54,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:55:54,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 00:55:54,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=545013.3333333334, ans=0.125 2023-09-30 00:56:00,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:01,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:56:03,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:56:05,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:07,653 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.74 vs. limit=15.0 2023-09-30 00:56:08,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:13,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:56:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. 
Duration: 0.86 2023-09-30 00:56:18,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=545146.6666666666, ans=0.125 2023-09-30 00:56:18,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=545146.6666666666, ans=6.0 2023-09-30 00:56:19,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:21,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:56:23,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:56:26,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 00:56:31,387 INFO [train.py:1039] (0/4) Epoch 16, batch 2100, loss[loss=0.1778, simple_loss=0.2698, pruned_loss=0.04287, over 24513.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2576, pruned_loss=0.05568, over 4703273.66 frames. ], batch size: 71, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:56:31,455 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 00:56:31,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:31,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:33,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 00:56:34,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:34,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 00:56:34,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 00:56:37,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:41,270 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.889e+02 2.067e+02 2.438e+02 3.667e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 00:56:41,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:56:41,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:56:44,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:46,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:56:46,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 00:56:47,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:56:47,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. 
Duration: 0.76 2023-09-30 00:56:47,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 00:56:49,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:56:50,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:56:50,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 00:56:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:56:55,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 00:56:55,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:59,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:59,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:57:04,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:57:04,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 00:57:06,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:57:06,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:57:08,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 00:57:09,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:09,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 00:57:11,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 00:57:11,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 00:57:14,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:57:16,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:57:19,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:20,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:22,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:23,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 00:57:23,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 00:57:23,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:23,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:25,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:25,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 00:57:25,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=545413.3333333334, ans=0.09899494936611666 2023-09-30 00:57:25,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=545413.3333333334, ans=0.125 2023-09-30 00:57:26,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 00:57:28,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 00:57:30,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:57:30,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=545413.3333333334, ans=0.0 2023-09-30 00:57:34,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 00:57:34,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 00:57:41,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:44,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:57:46,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:57:46,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:57:46,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:57:46,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:57:48,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:48,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:57:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:57:50,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:51,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. 
Duration: 0.87 2023-09-30 00:57:53,092 INFO [train.py:1039] (0/4) Epoch 16, batch 2150, loss[loss=0.1811, simple_loss=0.2669, pruned_loss=0.0477, over 24381.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2564, pruned_loss=0.0553, over 4694957.85 frames. ], batch size: 77, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:57:53,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 00:57:53,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:57:57,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:57,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:57:57,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:57:58,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:58:05,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:58:06,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:08,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:09,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 00:58:09,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:11,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:58:14,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:14,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:58:14,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:58:18,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:18,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 00:58:23,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:25,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:58:28,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:28,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:28,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:58:29,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:58:29,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:29,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:58:29,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:58:31,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 00:58:32,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:58:32,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:34,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:35,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:58:37,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 00:58:37,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=545680.0, ans=0.025 2023-09-30 00:58:38,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:39,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:58:40,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:40,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 00:58:40,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:58:44,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:44,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:46,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:47,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:58:49,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:51,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 00:58:51,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 00:58:52,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 00:58:52,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:58:54,096 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 00:58:54,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:54,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:58:55,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 00:58:55,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:58:55,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 00:58:56,436 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 00:58:56,437 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. 
Duration: 0.896 2023-09-30 00:58:56,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=545813.3333333334, ans=0.125 2023-09-30 00:58:57,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 00:58:59,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:00,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:59:01,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:59:02,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:03,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:59:04,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:05,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:07,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=545813.3333333334, ans=0.1 2023-09-30 00:59:13,267 INFO [train.py:1039] (0/4) Epoch 16, batch 2200, loss[loss=0.1806, simple_loss=0.2614, pruned_loss=0.0499, over 24643.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2569, pruned_loss=0.05518, over 4698181.48 frames. 
], batch size: 65, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:59:13,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:59:13,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 00:59:17,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:59:24,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:24,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:59:24,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:59:25,922 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.877e+02 2.145e+02 2.548e+02 4.503e+02, threshold=4.290e+02, percent-clipped=1.0 2023-09-30 00:59:26,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:59:27,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:29,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:59:29,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. 
Duration: 0.88 2023-09-30 00:59:33,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 00:59:36,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:59:41,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=545946.6666666666, ans=0.1 2023-09-30 00:59:42,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 00:59:44,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:45,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:59:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:59:49,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:59:50,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 00:59:53,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:59:55,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:56,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. 
Duration: 0.4 2023-09-30 00:59:59,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:59:59,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=546013.3333333334, ans=0.0 2023-09-30 00:59:59,973 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=15.0 2023-09-30 01:00:00,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:03,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:00:04,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:07,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 01:00:09,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:09,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 01:00:10,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:12,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:00:12,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:00:13,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:00:13,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:13,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:00:16,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:00:18,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:00:18,950 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=546146.6666666666, ans=0.125 2023-09-30 01:00:22,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 01:00:23,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:00:25,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:00:28,093 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. 
Duration: 0.896 2023-09-30 01:00:29,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:00:31,166 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 01:00:32,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:00:32,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 01:00:34,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:34,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:00:37,299 INFO [train.py:1039] (0/4) Epoch 16, batch 2250, loss[loss=0.1874, simple_loss=0.2614, pruned_loss=0.05663, over 22125.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2573, pruned_loss=0.05564, over 4699273.14 frames. ], batch size: 48, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 01:00:37,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:39,534 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 01:00:41,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:00:42,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:48,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 01:00:51,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:00:53,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:00:55,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:00:55,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:58,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 01:00:58,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:58,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:01:02,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 01:01:03,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:01:03,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:04,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 01:01:04,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=546280.0, ans=0.125 2023-09-30 01:01:08,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:11,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:01:11,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:01:12,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 01:01:14,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:15,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:01:20,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:21,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:24,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:01:24,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:01:26,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 01:01:27,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:01:32,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:01:32,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=546413.3333333334, ans=0.07 2023-09-30 01:01:34,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:01:40,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:01:40,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:01:40,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:01:50,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:01:52,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:01:52,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 01:01:52,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:52,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 01:01:55,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 01:01:57,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:01:57,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:59,840 INFO [train.py:1039] (0/4) Epoch 16, batch 2300, loss[loss=0.1828, simple_loss=0.2621, pruned_loss=0.05176, over 24524.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2589, pruned_loss=0.05618, over 4705509.35 frames. ], batch size: 66, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:02:05,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=546546.6666666666, ans=0.125 2023-09-30 01:02:06,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:06,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:02:08,687 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=546546.6666666666, ans=0.1 2023-09-30 01:02:10,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 01:02:11,887 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.854e+02 2.032e+02 2.223e+02 2.869e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 01:02:12,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:02:19,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:02:19,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:02:19,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:20,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:20,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 01:02:21,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:02:23,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:24,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:02:29,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:02:31,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:02:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:39,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 01:02:41,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:43,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:02:46,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:50,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:51,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:02:52,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:02:52,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 01:02:57,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:02:57,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:57,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:57,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:02:57,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:02:59,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:02:59,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:03:00,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 01:03:01,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:03:01,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:01,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 01:03:03,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=546746.6666666666, ans=0.125 2023-09-30 01:03:03,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=546746.6666666666, ans=0.07 2023-09-30 01:03:06,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:03:09,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:03:14,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:03:14,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 01:03:16,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:03:18,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:03:18,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:03:19,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:03:21,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 01:03:22,685 INFO [train.py:1039] (0/4) Epoch 16, batch 2350, loss[loss=0.1774, simple_loss=0.2505, pruned_loss=0.05217, over 19216.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.259, pruned_loss=0.05579, over 4710535.65 frames. ], batch size: 41, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:03:26,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:03:26,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 01:03:30,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 01:03:34,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:37,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:03:37,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:39,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:03:39,647 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=15.0 2023-09-30 01:03:40,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 01:03:40,979 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=546946.6666666666, ans=0.0 2023-09-30 01:03:41,298 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.90 vs. limit=12.0 2023-09-30 01:03:42,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=546946.6666666666, ans=0.1 2023-09-30 01:03:42,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=546946.6666666666, ans=0.1 2023-09-30 01:03:45,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:03:46,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=546946.6666666666, ans=0.125 2023-09-30 01:03:51,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 01:03:54,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:59,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:03:59,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:04:00,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:04:01,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 01:04:02,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:04:05,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:04:05,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:05,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:04:09,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 01:04:12,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 01:04:12,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:04:15,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:04:15,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:04:15,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=547080.0, ans=0.0 2023-09-30 01:04:15,523 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:04:16,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 01:04:18,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:04:22,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 01:04:22,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:04:27,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 01:04:31,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. 
Duration: 0.73 2023-09-30 01:04:32,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:32,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:04:32,691 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 01:04:32,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 01:04:36,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=547146.6666666666, ans=0.125 2023-09-30 01:04:37,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 01:04:40,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:04:43,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=547213.3333333334, ans=0.1 2023-09-30 01:04:44,706 INFO [train.py:1039] (0/4) Epoch 16, batch 2400, loss[loss=0.1972, simple_loss=0.2584, pruned_loss=0.06803, over 23769.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2584, pruned_loss=0.05607, over 4702422.15 frames. ], batch size: 179, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:04:44,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:04:48,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 01:04:51,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:04:53,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 01:04:53,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 01:04:56,007 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.899e+02 2.114e+02 2.474e+02 3.602e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 01:05:00,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:05:00,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:00,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=547280.0, ans=10.0 2023-09-30 01:05:03,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 01:05:03,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:05:05,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:07,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 01:05:10,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:05:12,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 01:05:18,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:05:21,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 01:05:23,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:05:25,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:28,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:05:31,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 01:05:31,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:05:32,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=547346.6666666666, ans=0.95 2023-09-30 01:05:37,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=547413.3333333334, ans=0.0 2023-09-30 01:05:41,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:42,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:05:47,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:50,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:05:50,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:05:50,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:05:50,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:50,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:05:50,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:05:57,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:05:58,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:05:58,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 01:05:59,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. 
Duration: 0.96 2023-09-30 01:06:00,272 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=547480.0, ans=0.125 2023-09-30 01:06:01,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:06:01,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:06:01,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 01:06:03,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 01:06:03,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 01:06:03,326 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 01:06:04,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 01:06:06,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:06:08,318 INFO [train.py:1039] (0/4) Epoch 16, batch 2450, loss[loss=0.1673, simple_loss=0.2154, pruned_loss=0.05956, over 19157.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2569, pruned_loss=0.05564, over 4687748.84 frames. ], batch size: 388, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:06:08,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:06:08,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:10,609 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 01:06:10,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:12,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:06:15,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:06:15,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:19,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:19,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:20,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 01:06:22,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=547546.6666666666, ans=0.2 2023-09-30 01:06:24,519 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.48 vs. limit=6.0 2023-09-30 01:06:26,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:06:26,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:30,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:06:30,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:06:30,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:06:30,702 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=547613.3333333334, ans=10.0 2023-09-30 01:06:31,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 01:06:33,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:36,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:06:38,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:06:44,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:06:44,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 01:06:45,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:48,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 01:06:48,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:06:57,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:58,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:58,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:00,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:07:00,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:00,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:07:01,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 01:07:05,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 01:07:06,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:07:09,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:07:09,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:16,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:07:16,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 01:07:18,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:07:20,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:07:20,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 01:07:22,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:07:22,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:07:26,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:07:27,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 01:07:29,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:07:30,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 01:07:32,335 INFO [train.py:1039] (0/4) Epoch 16, batch 2500, loss[loss=0.1493, simple_loss=0.1982, pruned_loss=0.05018, over 18793.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2554, pruned_loss=0.05486, over 4692112.89 frames. ], batch size: 389, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:07:32,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:07:38,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:44,698 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.913e+02 2.172e+02 2.502e+02 3.550e+02, threshold=4.344e+02, percent-clipped=0.0 2023-09-30 01:07:47,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:07:47,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:49,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:49,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-09-30 01:07:56,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=547946.6666666666, ans=0.1 2023-09-30 01:07:57,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:07:59,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:01,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:08:01,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:08:01,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 01:08:04,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:04,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:06,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 01:08:06,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:07,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 01:08:07,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 01:08:12,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:08:14,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:17,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:08:17,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 01:08:18,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:08:20,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:24,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:29,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:31,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:37,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:08:40,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 01:08:40,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:08:40,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:08:42,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:08:42,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:08:43,684 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 01:08:43,685 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 01:08:43,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 01:08:48,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:50,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 01:08:50,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 01:08:51,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:51,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 01:08:54,998 INFO [train.py:1039] (0/4) Epoch 16, batch 2550, loss[loss=0.1931, simple_loss=0.2649, pruned_loss=0.06062, over 23705.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2561, pruned_loss=0.05513, over 4692783.31 frames. 
], batch size: 232, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:08:55,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 01:08:58,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:00,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:09:02,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:09:03,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:05,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 01:09:06,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:09:09,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 01:09:10,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:09:12,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:15,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:09:15,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. 
Duration: 0.4555625 2023-09-30 01:09:17,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:17,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:18,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:20,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:09:20,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 01:09:21,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:09:21,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:21,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 01:09:33,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:09:41,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:41,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:41,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 01:09:43,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:09:51,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:54,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:54,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:09:54,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:09:54,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:09:54,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:09:57,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=548413.3333333334, ans=15.0 2023-09-30 01:09:58,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:58,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:03,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 01:10:03,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 01:10:03,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:10:05,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:06,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:10:06,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:10:09,167 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=548480.0, ans=0.125 2023-09-30 01:10:10,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:10,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=548480.0, ans=0.0 2023-09-30 01:10:16,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:10:18,237 INFO [train.py:1039] (0/4) Epoch 16, batch 2600, loss[loss=0.1948, simple_loss=0.2635, pruned_loss=0.06307, over 15462.00 frames. ], tot_loss[loss=0.184, simple_loss=0.257, pruned_loss=0.0555, over 4688524.49 frames. ], batch size: 33, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:10:19,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:10:24,933 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 01:10:25,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=548546.6666666666, ans=0.125 2023-09-30 01:10:26,539 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 01:10:26,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:10:26,624 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 01:10:28,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 01:10:28,168 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 01:10:28,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=548546.6666666666, ans=0.0 2023-09-30 01:10:31,178 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.861e+02 2.102e+02 2.275e+02 3.590e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-30 01:10:31,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:10:31,383 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 01:10:32,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. 
Duration: 0.97 2023-09-30 01:10:33,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=548613.3333333334, ans=0.2 2023-09-30 01:10:34,358 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 01:10:37,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:10:39,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=548613.3333333334, ans=0.125 2023-09-30 01:10:40,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 01:10:41,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 01:10:44,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:10:44,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 01:10:46,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=548613.3333333334, ans=0.2 2023-09-30 01:10:48,104 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 01:10:48,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 01:10:54,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 01:10:54,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:54,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:10:54,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 01:10:58,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:11:04,523 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 01:11:09,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:09,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:10,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 01:11:10,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:10,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:11:12,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 01:11:16,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 01:11:16,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:11:19,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,810 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 01:11:22,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:11:27,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:29,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:11:29,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 01:11:31,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:32,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:11:34,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:11:35,271 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.10 vs. 
limit=15.0 2023-09-30 01:11:40,236 INFO [train.py:1039] (0/4) Epoch 16, batch 2650, loss[loss=0.1834, simple_loss=0.2582, pruned_loss=0.05435, over 23387.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2583, pruned_loss=0.05584, over 4699809.45 frames. ], batch size: 119, lr: 6.46e-03, grad_scale: 4.0 2023-09-30 01:11:40,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 01:11:40,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:43,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:11:44,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=548880.0, ans=0.1 2023-09-30 01:11:48,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 01:11:48,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:49,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:11:49,113 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 01:11:51,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:11:54,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 01:11:55,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:11:56,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=548946.6666666666, ans=0.125 2023-09-30 01:11:57,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:12:00,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:12:02,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 01:12:02,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:12:02,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:12:05,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 01:12:07,434 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 01:12:10,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:12,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 01:12:13,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:12:13,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 01:12:17,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:17,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:12:17,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:18,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:18,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=549013.3333333334, ans=0.1 2023-09-30 01:12:20,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=549013.3333333334, ans=0.1 2023-09-30 01:12:23,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 01:12:25,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 01:12:28,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:12:32,219 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.70 vs. limit=15.0 2023-09-30 01:12:32,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-09-30 01:12:32,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:34,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:34,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:12:35,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:35,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:37,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:39,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:40,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:12:40,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:12:41,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=549080.0, ans=0.125 2023-09-30 01:12:42,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:12:44,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:12:44,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:12:46,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:47,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:49,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:12:52,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:52,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:12:52,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:53,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 01:12:57,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:59,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 01:13:04,888 INFO [train.py:1039] (0/4) Epoch 16, batch 2700, loss[loss=0.1778, simple_loss=0.2607, pruned_loss=0.04747, over 24477.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2598, pruned_loss=0.05629, over 4703055.13 frames. ], batch size: 69, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:13:06,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:13:06,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:08,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:08,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 01:13:11,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:13:11,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:13:16,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:13:16,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:16,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:18,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 01:13:18,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:13:18,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:13:18,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:13:18,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 01:13:19,482 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.941e+02 2.174e+02 2.573e+02 4.504e+02, threshold=4.348e+02, percent-clipped=1.0 2023-09-30 01:13:19,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:13:22,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:13:22,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:13:22,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:25,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:13:26,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=549280.0, ans=0.125 2023-09-30 01:13:27,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. 
Duration: 0.98 2023-09-30 01:13:28,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:13:34,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:13:34,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:13:40,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:13:40,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:40,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:13:41,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:13:43,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:13:48,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:13:48,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:13:48,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 01:13:51,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:51,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:13:58,965 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.46 vs. limit=6.0 2023-09-30 01:13:59,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:14:01,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:02,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:14:02,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:08,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:09,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:11,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:14:12,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:14:14,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:14,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:16,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:14:18,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:18,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:22,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 01:14:22,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:25,742 INFO [train.py:1039] (0/4) Epoch 16, batch 2750, loss[loss=0.1849, simple_loss=0.2633, pruned_loss=0.05326, over 24463.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2593, pruned_loss=0.05614, over 4700751.30 frames. ], batch size: 63, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:14:25,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:14:27,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 01:14:28,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. 
Duration: 0.94 2023-09-30 01:14:29,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:32,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:33,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:35,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:35,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:14:35,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=549546.6666666666, ans=0.5 2023-09-30 01:14:36,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:40,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:14:40,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:14:40,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:14:40,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:14:40,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 01:14:42,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:14:42,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:48,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 01:14:50,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:50,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:50,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:52,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:14:52,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:53,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:14:53,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:53,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:14:58,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:15:00,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:15:00,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:15:01,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:01,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:15:08,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:15:10,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:15:10,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:17,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:17,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:15:18,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 01:15:22,671 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.92 vs. limit=15.0 2023-09-30 01:15:23,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:15:25,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:15:25,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 01:15:30,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:32,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 01:15:35,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=549813.3333333334, ans=0.0 2023-09-30 01:15:37,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:15:39,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:15:40,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 01:15:41,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:15:42,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 01:15:42,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 01:15:44,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:15:47,514 INFO [train.py:1039] (0/4) Epoch 16, batch 2800, loss[loss=0.2002, simple_loss=0.2635, pruned_loss=0.06844, over 23817.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2581, pruned_loss=0.05601, over 4699827.07 frames. ], batch size: 164, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:15:47,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:15:47,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:15:47,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:15:49,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 01:15:49,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:51,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:52,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:54,422 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. 
Duration: 0.708 2023-09-30 01:15:54,423 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 01:15:57,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:57,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=549880.0, ans=0.0 2023-09-30 01:16:00,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:16:00,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:16:01,934 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.870e+02 2.056e+02 2.355e+02 4.086e+02, threshold=4.112e+02, percent-clipped=0.0 2023-09-30 01:16:02,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:16:04,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 01:16:07,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:16:08,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 01:16:10,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:10,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 01:16:10,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:13,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:15,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:15,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:16:16,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:26,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:16:28,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:29,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:31,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:16:31,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:36,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:36,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. 
Duration: 0.69 2023-09-30 01:16:36,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:38,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:38,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:16:45,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:45,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:47,889 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.84 vs. limit=22.5 2023-09-30 01:16:48,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:50,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:16:50,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:50,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:16:51,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 01:16:51,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:16:54,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:54,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 01:16:54,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:56,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:56,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:58,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 01:16:59,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:01,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:17:01,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:17:04,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 01:17:10,298 INFO [train.py:1039] (0/4) Epoch 16, batch 2850, loss[loss=0.183, simple_loss=0.2509, pruned_loss=0.05759, over 23390.00 frames. 
], tot_loss[loss=0.1839, simple_loss=0.2569, pruned_loss=0.05545, over 4706655.49 frames. ], batch size: 105, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:17:10,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:17:10,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:17:10,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:17:12,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:15,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:15,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:17:17,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:17:19,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:20,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:17:22,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:17:23,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. 
Duration: 0.5 2023-09-30 01:17:29,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 01:17:31,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:31,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 01:17:33,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:33,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=550280.0, ans=0.125 2023-09-30 01:17:36,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 01:17:37,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 01:17:39,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:41,879 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.53 vs. limit=22.5 2023-09-30 01:17:50,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:52,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:17:53,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 01:17:55,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:17:55,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:17:55,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:17:56,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:17:58,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 01:17:59,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:17:59,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:01,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:01,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:05,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:05,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:07,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:18:09,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:18:10,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:18:12,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:14,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:15,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:18:18,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:18:21,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 01:18:22,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 01:18:24,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:18:24,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 01:18:25,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 01:18:25,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:27,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:18:27,272 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 01:18:27,331 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 01:18:27,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:27,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:33,168 INFO [train.py:1039] (0/4) Epoch 16, batch 2900, loss[loss=0.2034, simple_loss=0.2727, pruned_loss=0.06702, over 23880.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2569, pruned_loss=0.05518, over 4703444.49 frames. ], batch size: 195, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:18:33,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:18:34,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:34,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 01:18:37,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 01:18:37,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=550546.6666666666, ans=0.125 2023-09-30 01:18:41,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:42,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 01:18:43,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 01:18:45,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:18:46,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:18:46,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:47,231 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=550546.6666666666, ans=0.2 2023-09-30 01:18:48,255 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.959e+02 2.363e+02 2.831e+02 4.091e+02, threshold=4.726e+02, percent-clipped=0.0 2023-09-30 01:18:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:51,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 01:18:52,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:56,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:18:56,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 01:18:56,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:18:57,707 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.60 vs. limit=15.0 2023-09-30 01:18:58,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:01,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 01:19:03,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 01:19:04,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:19:04,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 01:19:04,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:19:07,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:19:07,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:19:09,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:19:11,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:13,830 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.25 vs. limit=22.5 2023-09-30 01:19:14,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:19:19,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:21,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 01:19:21,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 01:19:21,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:19:27,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:19:28,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 01:19:29,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 01:19:35,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:37,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=550813.3333333334, ans=0.025 2023-09-30 01:19:45,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:19:45,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:19:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 01:19:50,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:52,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 01:19:52,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:19:52,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:19:55,411 INFO [train.py:1039] (0/4) Epoch 16, batch 2950, loss[loss=0.1803, simple_loss=0.2698, pruned_loss=0.04545, over 24644.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2579, pruned_loss=0.05479, over 4712589.50 frames. ], batch size: 73, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:19:59,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:20:00,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 01:20:01,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=550880.0, ans=0.1 2023-09-30 01:20:02,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:02,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:04,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:06,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:20:07,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 01:20:07,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 01:20:07,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:20:07,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:15,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:19,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 01:20:20,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:20:20,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:24,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:20:24,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:20:27,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:20:29,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 01:20:34,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 01:20:34,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 01:20:36,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:20:37,765 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-09-30 01:20:39,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 01:20:39,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:41,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:41,173 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 01:20:41,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:20:42,035 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.34 vs. limit=12.0 2023-09-30 01:20:44,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 01:20:45,235 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.77 vs. limit=22.5 2023-09-30 01:20:45,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:45,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:20:46,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=551080.0, ans=0.2 2023-09-30 01:20:47,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:20:49,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:20:49,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:49,188 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 01:20:51,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:51,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 01:20:58,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:59,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:20:59,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 01:20:59,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:21:01,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 01:21:06,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:08,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 01:21:09,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:21:12,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:21:12,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:21:14,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:21:14,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:16,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:21:16,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:21:17,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:19,001 INFO [train.py:1039] (0/4) Epoch 16, batch 3000, loss[loss=0.1871, simple_loss=0.2528, pruned_loss=0.06065, over 23730.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2585, pruned_loss=0.05501, over 4722722.49 frames. 
], batch size: 149, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:21:19,002 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 01:21:30,869 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.3.encoder.layers.3.self_attn_weights, attn_weights_entropy = tensor([3.7544, 3.5565, 2.4787, 3.6524, 2.5887, 3.0157, 3.5728, 3.4720], device='cuda:0') 2023-09-30 01:21:34,547 INFO [train.py:1071] (0/4) Epoch 16, validation: loss=0.3091, simple_loss=0.2818, pruned_loss=0.1682, over 1125622.00 frames. 2023-09-30 01:21:34,548 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 01:21:34,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:21:36,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:36,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 01:21:37,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:39,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:21:39,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:21:39,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=551213.3333333334, ans=0.2 2023-09-30 01:21:43,125 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. 
Duration: 0.896 2023-09-30 01:21:44,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 01:21:47,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:21:47,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:21:47,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 01:21:47,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:21:47,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=551213.3333333334, ans=0.125 2023-09-30 01:21:49,612 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.865e+02 2.031e+02 2.277e+02 3.298e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 01:21:54,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=551280.0, ans=0.125 2023-09-30 01:21:55,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:22:05,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:22:13,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 01:22:14,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 01:22:16,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:22:16,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:22:18,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:22:20,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:20,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 01:22:23,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 01:22:24,431 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-09-30 01:22:25,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:22:25,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:22:27,555 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=551413.3333333334, ans=0.1 2023-09-30 01:22:28,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 01:22:30,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:31,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:22:35,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:22:35,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:35,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:22:38,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:40,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 01:22:42,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:22:43,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:22:43,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:22:45,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:22:47,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:48,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:22:48,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 01:22:49,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:22:49,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 01:22:50,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:22:52,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 01:22:53,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:22:55,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:22:55,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 01:22:57,404 INFO [train.py:1039] (0/4) Epoch 16, batch 3050, loss[loss=0.1872, simple_loss=0.2705, pruned_loss=0.052, over 24272.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2594, pruned_loss=0.05532, over 4728055.20 frames. 
], batch size: 74, lr: 6.45e-03, grad_scale: 8.0 2023-09-30 01:22:57,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 01:22:57,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:22:59,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:23:01,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:23:01,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:23:01,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:01,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:23:04,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 01:23:05,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:08,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 01:23:09,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=551546.6666666666, ans=0.1 2023-09-30 01:23:10,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:23:12,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=551613.3333333334, ans=0.125 2023-09-30 01:23:15,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 01:23:23,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 01:23:23,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 01:23:23,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:29,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:23:31,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:33,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:33,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:23:37,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:23:38,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:23:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:38,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:38,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:40,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:41,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:45,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:45,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 01:23:45,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:46,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:23:50,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 01:23:50,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:23:51,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:23:51,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:23:53,969 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=551746.6666666666, ans=0.125 2023-09-30 01:23:56,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:58,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:03,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:05,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:05,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:06,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:08,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 01:24:08,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:24:08,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 01:24:10,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:10,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:10,840 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:24:12,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 01:24:15,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:16,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=551813.3333333334, ans=0.2 2023-09-30 01:24:20,247 INFO [train.py:1039] (0/4) Epoch 16, batch 3100, loss[loss=0.1692, simple_loss=0.253, pruned_loss=0.04268, over 24491.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2605, pruned_loss=0.0561, over 4720242.24 frames. ], batch size: 66, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:24:20,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:22,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 01:24:25,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:24:26,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 01:24:30,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 01:24:31,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 01:24:33,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:24:35,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:24:36,434 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.867e+02 2.041e+02 2.309e+02 3.619e+02, threshold=4.081e+02, percent-clipped=0.0 2023-09-30 01:24:36,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:38,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:24:40,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=551946.6666666666, ans=0.1 2023-09-30 01:24:43,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:24:48,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=551946.6666666666, ans=0.125 2023-09-30 01:24:49,295 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.59 vs. limit=22.5 2023-09-30 01:24:50,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 01:24:54,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:24:55,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:56,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:24:56,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:56,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:24:59,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:24:59,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 01:24:59,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 01:25:00,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:02,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 01:25:03,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:06,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:25:07,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 01:25:08,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 01:25:10,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:11,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:13,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:13,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:13,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:25:15,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 01:25:15,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:25:18,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:25:18,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:25:18,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:18,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:25:23,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:25:23,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 01:25:25,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:25:27,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 01:25:27,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:27,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:27,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. 
Duration: 0.51 2023-09-30 01:25:30,600 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=552146.6666666666, ans=0.1 2023-09-30 01:25:40,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 01:25:40,786 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.43 vs. limit=15.0 2023-09-30 01:25:41,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:43,536 INFO [train.py:1039] (0/4) Epoch 16, batch 3150, loss[loss=0.1566, simple_loss=0.2351, pruned_loss=0.03901, over 24452.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2585, pruned_loss=0.05547, over 4724169.28 frames. ], batch size: 58, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:25:43,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:45,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:25:45,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:25:46,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 01:25:46,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:47,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 01:25:49,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 01:25:50,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:52,343 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 01:25:56,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 01:25:56,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:56,980 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 01:25:59,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:25:59,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 01:26:00,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 01:26:00,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 01:26:00,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:00,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:26:02,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:02,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=552280.0, ans=0.0 2023-09-30 01:26:05,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 01:26:05,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:05,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=552280.0, ans=0.125 2023-09-30 01:26:07,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:07,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:09,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:26:13,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 01:26:15,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:26:18,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:26:19,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:26:20,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 01:26:22,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=552346.6666666666, ans=0.125 2023-09-30 01:26:23,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 01:26:25,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:26:26,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:26:27,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:26:27,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:27,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:26:28,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:26:28,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:26:30,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. 
Duration: 0.54 2023-09-30 01:26:31,010 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.56 vs. limit=15.0 2023-09-30 01:26:31,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:26:31,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:33,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:26:33,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:35,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 01:26:35,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:38,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 01:26:38,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:39,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 01:26:42,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 01:26:43,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 01:26:43,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:45,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 01:26:45,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:26:46,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:50,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:26:51,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:51,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:26:58,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:26:59,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:00,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 01:27:07,195 INFO [train.py:1039] (0/4) Epoch 16, batch 3200, loss[loss=0.1786, simple_loss=0.2482, pruned_loss=0.05452, over 23732.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2585, pruned_loss=0.05505, over 4734575.58 frames. 
], batch size: 149, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:27:07,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:27:07,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:27:07,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=552546.6666666666, ans=0.1 2023-09-30 01:27:10,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:11,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:27:11,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 01:27:15,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:27:15,872 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.24 vs. limit=15.0 2023-09-30 01:27:19,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:27:23,443 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.941e+02 2.200e+02 2.639e+02 4.791e+02, threshold=4.401e+02, percent-clipped=2.0 2023-09-30 01:27:23,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:27:32,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:27:39,920 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.85 vs. limit=15.0 2023-09-30 01:27:43,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 01:27:44,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:27:47,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 01:27:48,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:27:51,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:27:51,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:27:53,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:27:56,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 01:27:58,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:28:01,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. 
Duration: 0.82 2023-09-30 01:28:03,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 01:28:05,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:28:07,711 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=552746.6666666666, ans=0.125 2023-09-30 01:28:07,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=552746.6666666666, ans=0.125 2023-09-30 01:28:11,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:11,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:28:11,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:13,417 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 01:28:13,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:28:17,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=552813.3333333334, ans=0.125 2023-09-30 01:28:18,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:20,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. 
Duration: 0.64 2023-09-30 01:28:20,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 01:28:20,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 01:28:22,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=552813.3333333334, ans=0.125 2023-09-30 01:28:23,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 01:28:25,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:28:27,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:28:27,194 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 01:28:27,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:28:27,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:28,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=552813.3333333334, ans=0.1 2023-09-30 01:28:30,168 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 01:28:31,904 INFO [train.py:1039] (0/4) Epoch 16, batch 3250, loss[loss=0.1897, simple_loss=0.2788, pruned_loss=0.05026, over 24309.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2583, pruned_loss=0.05547, over 4711425.70 frames. 
], batch size: 74, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:28:32,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:28:35,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:28:44,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=552880.0, ans=0.0 2023-09-30 01:28:45,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:28:45,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 01:28:48,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:49,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:49,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:28:50,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:28:50,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:28:53,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:28:53,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:28:55,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:28:55,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:28:58,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:00,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:29:02,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:03,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:29:05,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:05,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:29:05,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 01:29:07,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=22.5 2023-09-30 01:29:12,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 01:29:12,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:29:12,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:29:13,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:15,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:29:22,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:29:30,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:30,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:30,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 01:29:30,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:29:30,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-30 01:29:31,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:34,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 01:29:34,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=553080.0, ans=0.0 2023-09-30 01:29:35,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 01:29:35,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:37,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:38,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:40,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:29:40,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:44,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:29:44,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:29:46,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-09-30 01:29:46,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:29:49,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:29:49,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 01:29:52,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:52,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 01:29:54,959 INFO [train.py:1039] (0/4) Epoch 16, batch 3300, loss[loss=0.1825, simple_loss=0.2673, pruned_loss=0.04888, over 24643.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.259, pruned_loss=0.05564, over 4719178.87 frames. ], batch size: 73, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:29:55,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 01:29:56,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 01:29:56,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:01,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:30:02,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 01:30:02,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:05,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:30:05,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:30:06,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:08,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:30:11,913 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.919e+02 2.093e+02 2.361e+02 4.091e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 01:30:12,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 01:30:13,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:13,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:15,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=553280.0, ans=0.0 2023-09-30 01:30:16,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:16,551 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. 
Duration: 0.896 2023-09-30 01:30:18,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:30:19,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:30:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:30:19,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:30:19,616 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 01:30:25,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:25,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:30:28,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:28,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 01:30:30,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 01:30:30,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:30:31,094 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=553346.6666666666, ans=0.2 2023-09-30 01:30:32,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:30:32,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=553346.6666666666, ans=0.0 2023-09-30 01:30:33,649 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 01:30:35,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 01:30:35,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:30:39,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 01:30:40,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:30:44,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:30:45,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:30:49,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:50,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:30:50,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:50,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:30:53,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:30:53,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:53,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:30:55,199 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 01:30:56,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 01:30:59,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:31:00,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:00,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:31:03,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 01:31:03,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:31:05,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:05,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:31:06,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:08,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:31:10,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 01:31:12,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:14,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:15,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:31:15,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:31:15,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:19,228 INFO [train.py:1039] (0/4) Epoch 16, batch 3350, loss[loss=0.1883, simple_loss=0.2734, pruned_loss=0.05162, over 24567.00 frames. 
], tot_loss[loss=0.1859, simple_loss=0.2599, pruned_loss=0.056, over 4727473.20 frames. ], batch size: 71, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:31:19,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:19,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:21,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:31:22,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:24,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:31:27,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:27,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=553546.6666666666, ans=0.125 2023-09-30 01:31:28,164 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.59 vs. limit=15.0 2023-09-30 01:31:28,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:31:30,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:31,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 01:31:33,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 01:31:34,769 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 01:31:34,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:38,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 01:31:38,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 01:31:39,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:31:40,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:31:40,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:41,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 01:31:42,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:42,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:31:45,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 01:31:47,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:47,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:49,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:31:50,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:55,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:55,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:58,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:59,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:01,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:01,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:03,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:06,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-09-30 01:32:06,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:32:07,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 01:32:07,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:32:09,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 01:32:10,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:12,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:13,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=553746.6666666666, ans=0.1 2023-09-30 01:32:20,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:21,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 01:32:23,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:25,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:32:25,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 01:32:30,664 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-09-30 01:32:31,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:33,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=553813.3333333334, ans=0.125 2023-09-30 01:32:34,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 01:32:34,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:32:34,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:32:37,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:38,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 01:32:38,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:38,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 01:32:40,381 INFO [train.py:1039] (0/4) Epoch 16, batch 3400, loss[loss=0.1999, simple_loss=0.2605, pruned_loss=0.06961, over 23382.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2603, pruned_loss=0.05637, over 4728623.14 frames. 
], batch size: 285, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:32:40,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:40,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:42,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:32:42,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:32:42,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 01:32:44,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=553880.0, ans=0.0 2023-09-30 01:32:46,024 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-09-30 01:32:46,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 01:32:46,917 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 01:32:46,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:49,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=553880.0, ans=0.125 2023-09-30 01:32:52,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:32:52,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:54,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:32:56,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:32:58,066 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.849e+02 2.111e+02 2.348e+02 3.492e+02, threshold=4.221e+02, percent-clipped=0.0 2023-09-30 01:33:01,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:04,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 01:33:10,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:33:11,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:12,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:13,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:33:19,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:33:25,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-09-30 01:33:27,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.16 vs. limit=15.0 2023-09-30 01:33:32,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:32,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:33,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 01:33:33,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:33:35,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:36,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:36,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:33:40,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:43,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:33:43,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 01:33:51,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:33:52,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 01:33:56,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:34:01,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 01:34:03,390 INFO [train.py:1039] (0/4) Epoch 16, batch 3450, loss[loss=0.1941, simple_loss=0.2718, pruned_loss=0.05821, over 23520.00 frames. ], tot_loss[loss=0.187, simple_loss=0.261, pruned_loss=0.05652, over 4725767.32 frames. ], batch size: 93, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:34:07,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 01:34:07,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:09,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:34:09,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 01:34:10,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:34:15,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. 
Duration: 0.563625 2023-09-30 01:34:18,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=554280.0, ans=0.0 2023-09-30 01:34:21,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:34:21,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:21,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:34:21,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:24,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:29,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 01:34:36,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 01:34:36,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:34:36,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:34:38,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:44,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. 
Duration: 0.99 2023-09-30 01:34:44,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:34:44,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=554346.6666666666, ans=10.0 2023-09-30 01:34:47,844 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=554346.6666666666, ans=0.0 2023-09-30 01:34:49,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:34:49,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:50,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:34:52,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:34:53,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 01:34:53,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:34:55,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:59,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:02,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. 
Duration: 0.86 2023-09-30 01:35:05,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:35:06,404 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.99 vs. limit=6.0 2023-09-30 01:35:12,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:35:13,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:17,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:22,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:22,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:35:22,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:35:22,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:35:25,327 INFO [train.py:1039] (0/4) Epoch 16, batch 3500, loss[loss=0.1726, simple_loss=0.2397, pruned_loss=0.05277, over 23683.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2593, pruned_loss=0.05578, over 4730325.98 frames. ], batch size: 232, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:35:27,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:35:28,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=554546.6666666666, ans=0.1 2023-09-30 01:35:30,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:35:30,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 01:35:34,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:35:36,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:35:39,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:39,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 01:35:43,267 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.893e+02 2.078e+02 2.352e+02 3.454e+02, threshold=4.155e+02, percent-clipped=0.0 2023-09-30 01:35:45,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:35:47,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:48,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 01:35:48,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:35:49,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:35:49,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:51,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:35:51,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 01:35:53,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:53,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:35:55,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=554613.3333333334, ans=0.0 2023-09-30 01:35:56,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:00,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:00,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 01:36:00,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 01:36:01,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=554680.0, ans=0.125 2023-09-30 01:36:04,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:05,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:36:05,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:08,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:36:08,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:11,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 01:36:13,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 01:36:13,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 01:36:13,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:16,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:16,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 01:36:17,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:36:22,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:36:22,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:36:26,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:36:28,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 01:36:28,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 01:36:28,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:36:31,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:32,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:34,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:37,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 01:36:37,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 01:36:40,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:41,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 01:36:43,062 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.24 vs. limit=22.5 2023-09-30 01:36:43,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 01:36:46,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:48,093 INFO [train.py:1039] (0/4) Epoch 16, batch 3550, loss[loss=0.1644, simple_loss=0.2469, pruned_loss=0.04095, over 24319.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2573, pruned_loss=0.05517, over 4728025.82 frames. ], batch size: 61, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:36:48,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:48,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:36:48,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:36:52,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:36:59,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 01:37:01,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:37:01,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:03,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:37:03,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:04,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:37:04,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:37:09,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:09,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:37:10,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:10,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:37:11,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=554946.6666666666, ans=0.125 2023-09-30 01:37:12,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-30 01:37:14,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=554946.6666666666, ans=0.125 2023-09-30 01:37:18,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:37:18,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:20,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:20,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:21,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:37:21,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 01:37:21,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:23,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:24,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:37:31,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:33,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 01:37:33,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:33,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=555013.3333333334, ans=0.125 2023-09-30 01:37:35,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 01:37:36,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:37:38,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 01:37:39,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:41,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:37:41,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:37:44,625 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 01:37:44,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:50,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:52,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. 
Duration: 0.89 2023-09-30 01:37:52,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:37:58,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:58,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 01:38:07,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=555146.6666666666, ans=0.0 2023-09-30 01:38:08,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 01:38:10,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:10,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:38:11,647 INFO [train.py:1039] (0/4) Epoch 16, batch 3600, loss[loss=0.1704, simple_loss=0.2491, pruned_loss=0.04588, over 24398.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2563, pruned_loss=0.05534, over 4708823.42 frames. ], batch size: 58, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:38:11,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:13,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:14,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 01:38:18,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:19,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:21,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:38:22,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:38:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:22,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 01:38:25,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:38:27,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:28,709 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 2.004e+02 2.343e+02 2.780e+02 3.954e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-30 01:38:31,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:34,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=555280.0, ans=0.1 2023-09-30 01:38:35,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:38:37,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:38:39,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:39,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 01:38:39,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:42,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:43,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:38:43,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:38:47,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:47,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:38:48,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 01:38:53,644 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:38:54,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 01:38:56,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:38:56,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 01:38:56,725 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:39:01,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:39:06,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:09,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:15,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:39:15,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:39:15,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 01:39:17,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 01:39:17,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 01:39:20,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:39:20,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:39:22,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 01:39:23,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:23,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:39:23,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:25,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 01:39:26,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 01:39:29,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:30,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 01:39:32,475 INFO [train.py:1039] (0/4) Epoch 16, batch 3650, loss[loss=0.1646, simple_loss=0.249, pruned_loss=0.04007, over 24448.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2566, pruned_loss=0.05547, over 4713865.27 frames. ], batch size: 69, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:39:36,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. 
Duration: 0.79 2023-09-30 01:39:38,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:39:45,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 01:39:46,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 01:39:49,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:39:49,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:39:49,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:39:55,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:39:55,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:56,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.51 vs. limit=22.5 2023-09-30 01:39:57,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 01:39:57,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:39:57,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 01:39:57,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 01:39:59,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:40:00,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:00,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:00,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:40:03,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 01:40:06,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 01:40:06,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:40:08,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 01:40:10,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:10,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:40:14,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 01:40:17,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:17,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:40:19,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:40:20,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:40:20,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:40:25,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:27,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:27,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:28,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:40:28,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:30,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:36,434 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. 
Duration: 0.896 2023-09-30 01:40:39,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:41,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:41,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:40:42,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:42,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:40:44,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:44,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 01:40:46,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:49,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:40:53,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:53,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:40:54,842 INFO [train.py:1039] (0/4) Epoch 16, batch 3700, loss[loss=0.1993, simple_loss=0.2816, pruned_loss=0.05848, over 24672.00 frames. 
], tot_loss[loss=0.1849, simple_loss=0.2581, pruned_loss=0.05586, over 4714703.92 frames. ], batch size: 73, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:40:55,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.09 vs. limit=15.0 2023-09-30 01:40:56,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:56,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 01:40:56,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:58,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:40:59,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:41:01,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:41:05,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:05,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:08,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:41:08,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:41:08,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:41:11,786 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.928e+02 2.143e+02 2.451e+02 3.411e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 01:41:11,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:14,976 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 01:41:21,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:41:23,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:41:23,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:41:23,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 01:41:23,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:29,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:31,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 01:41:32,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 01:41:34,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:41:37,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:37,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:41:39,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:41:43,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:43,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 01:41:44,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:45,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 01:41:48,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:41:48,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:41:51,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:53,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. 
Duration: 0.81 2023-09-30 01:41:54,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:41:54,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:41:54,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:41:55,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:58,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:42:01,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 01:42:02,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 01:42:04,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:42:04,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:05,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:42:07,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:42:10,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 01:42:11,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:42:11,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:42:13,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 01:42:16,450 INFO [train.py:1039] (0/4) Epoch 16, batch 3750, loss[loss=0.1849, simple_loss=0.2546, pruned_loss=0.05757, over 23540.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2585, pruned_loss=0.05652, over 4702382.68 frames. ], batch size: 120, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:42:16,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:42:16,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=556213.3333333334, ans=0.125 2023-09-30 01:42:19,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:42:21,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 01:42:21,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:42:22,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:24,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:42:25,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:42:27,442 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=556213.3333333334, ans=0.125 2023-09-30 01:42:30,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:36,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:42:36,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:42:39,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:42:44,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:42:44,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 01:42:44,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=556280.0, ans=0.0 2023-09-30 01:42:46,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:42:47,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:42:49,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 01:42:53,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 01:42:56,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 01:42:59,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:42:59,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:43:01,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:02,268 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.73 vs. limit=15.0 2023-09-30 01:43:06,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:08,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:43:08,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=556413.3333333334, ans=0.125 2023-09-30 01:43:12,242 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=556413.3333333334, ans=0.125 2023-09-30 01:43:13,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. 
Duration: 0.98 2023-09-30 01:43:16,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:20,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:43:22,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:43:25,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:43:28,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:43:30,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:43:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:43:34,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:43:36,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:43:37,529 INFO [train.py:1039] (0/4) Epoch 16, batch 3800, loss[loss=0.1899, simple_loss=0.2743, pruned_loss=0.05273, over 24360.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2586, pruned_loss=0.05624, over 4719983.56 frames. ], batch size: 77, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:43:43,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 01:43:44,198 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:43:49,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:49,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:43:49,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 01:43:53,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:53,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:43:53,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=556613.3333333334, ans=0.1 2023-09-30 01:43:55,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:43:56,511 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.836e+02 2.013e+02 2.222e+02 3.108e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 01:43:56,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 01:43:56,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:58,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 01:43:58,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:59,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:43:59,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:01,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 01:44:04,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 01:44:06,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:44:07,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:10,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:44:12,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:44:12,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:44:12,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:17,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:44:17,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:23,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:44:23,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 01:44:25,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:28,682 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:44:32,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:44:39,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:44:40,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 01:44:42,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=556813.3333333334, ans=0.125 2023-09-30 01:44:43,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 01:44:45,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:45,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:44:46,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:48,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 01:44:53,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 01:44:53,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 01:44:53,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:54,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=556813.3333333334, ans=0.0 2023-09-30 01:44:55,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:45:01,145 INFO [train.py:1039] (0/4) Epoch 16, batch 3850, loss[loss=0.1678, simple_loss=0.2413, pruned_loss=0.0472, over 23331.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2568, pruned_loss=0.0558, over 4713683.84 frames. ], batch size: 119, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:45:02,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:45:02,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 01:45:07,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=556880.0, ans=0.125 2023-09-30 01:45:08,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:45:10,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 01:45:10,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:45:12,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:15,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:45:18,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:19,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:45:21,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 01:45:27,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=556946.6666666666, ans=0.2 2023-09-30 01:45:27,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=556946.6666666666, ans=0.0 2023-09-30 01:45:28,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:45:30,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:34,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:35,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:45:38,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:40,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:45:40,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:40,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:45:40,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=557013.3333333334, ans=0.0 2023-09-30 01:45:41,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:45:44,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:45:44,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 01:45:45,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 01:45:46,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:46,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:49,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 01:45:52,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 01:45:54,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:56,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 01:45:59,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. 
Duration: 0.7 2023-09-30 01:46:01,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=557080.0, ans=0.1 2023-09-30 01:46:04,088 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.61 vs. limit=10.0 2023-09-30 01:46:04,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:07,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:46:08,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=557146.6666666666, ans=0.125 2023-09-30 01:46:10,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:10,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 01:46:13,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 01:46:15,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:16,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:19,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-30 01:46:19,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:46:19,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:46:21,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 01:46:22,642 INFO [train.py:1039] (0/4) Epoch 16, batch 3900, loss[loss=0.1768, simple_loss=0.2192, pruned_loss=0.06715, over 19239.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2568, pruned_loss=0.05552, over 4714747.40 frames. ], batch size: 388, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:46:22,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:46:24,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 01:46:25,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:25,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:27,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 01:46:27,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:27,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:46:28,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:28,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:29,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:30,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 01:46:30,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:34,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:36,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:38,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:46:38,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 01:46:42,007 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.908e+02 2.170e+02 2.558e+02 5.090e+02, threshold=4.341e+02, percent-clipped=1.0 2023-09-30 01:46:42,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:42,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:42,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:46:45,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 01:46:45,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:46:46,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 01:46:46,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:48,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 01:46:48,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 01:46:53,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:46:53,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:46:53,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:46:55,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:46:58,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:47:01,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:47:03,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:47:03,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:05,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:47:09,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=557346.6666666666, ans=0.0 2023-09-30 01:47:12,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:12,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:47:19,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:47:21,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 01:47:28,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=557480.0, ans=0.0 2023-09-30 01:47:29,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=557480.0, ans=0.0 2023-09-30 01:47:33,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:47:34,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:36,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 01:47:36,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 01:47:37,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:40,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 01:47:42,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:43,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 01:47:46,591 INFO [train.py:1039] (0/4) Epoch 16, batch 3950, loss[loss=0.2019, simple_loss=0.272, pruned_loss=0.06586, over 23267.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2568, pruned_loss=0.05532, over 4719956.25 frames. 
], batch size: 105, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:47:52,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:53,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 01:47:53,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:47:58,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:48:00,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:48:05,173 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 01:48:05,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=557613.3333333334, ans=0.0 2023-09-30 01:48:06,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:06,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 01:48:06,770 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 01:48:08,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:09,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 01:48:09,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:48:09,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:10,186 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=557613.3333333334, ans=0.1 2023-09-30 01:48:11,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 01:48:13,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=557613.3333333334, ans=0.125 2023-09-30 01:48:14,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:48:16,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:16,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:48:17,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:48:18,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 01:48:20,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=557680.0, ans=0.05 2023-09-30 01:48:28,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=557680.0, ans=0.125 2023-09-30 01:48:31,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:48:31,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:48:32,122 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=557680.0, ans=0.125 2023-09-30 01:48:36,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 01:48:42,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 01:48:42,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 01:48:43,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:48:44,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=557746.6666666666, ans=0.125 2023-09-30 01:48:46,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 01:48:46,733 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.36 vs. limit=10.0 2023-09-30 01:48:53,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:48:53,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:48:53,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:53,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:48:55,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 01:49:00,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:49:00,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:49:05,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=557813.3333333334, ans=0.125 2023-09-30 01:49:07,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 01:49:10,294 INFO [train.py:1039] (0/4) Epoch 16, batch 4000, loss[loss=0.171, simple_loss=0.2403, pruned_loss=0.05086, over 23622.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.257, pruned_loss=0.05522, over 4724270.00 frames. 
], batch size: 135, lr: 6.41e-03, grad_scale: 32.0 2023-09-30 01:49:15,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:18,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=557880.0, ans=0.0 2023-09-30 01:49:20,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=557880.0, ans=0.125 2023-09-30 01:49:21,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:27,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:28,508 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.918e+02 2.123e+02 2.513e+02 3.159e+02, threshold=4.246e+02, percent-clipped=0.0 2023-09-30 01:49:28,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:49:28,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:28,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 01:49:30,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:49:31,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. 
Duration: 0.88 2023-09-30 01:49:31,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:49:31,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 01:49:33,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:37,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:49:37,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:49:37,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:49:37,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:37,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:49:40,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:49:40,209 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 01:49:40,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:49:42,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:49:43,882 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 01:49:45,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:49:45,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:49:51,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 01:49:53,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:55,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:49:57,447 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 01:49:59,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:49:59,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 01:49:59,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:00,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:02,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 01:50:03,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:50:03,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:50:03,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:50:06,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 01:50:07,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:09,119 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:50:10,273 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 01:50:15,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:50:18,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:50:20,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:50:21,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:22,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 01:50:23,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:25,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=558146.6666666666, ans=0.1 2023-09-30 01:50:28,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:31,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:50:31,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 01:50:32,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=558213.3333333334, ans=0.02 2023-09-30 01:50:33,778 INFO [train.py:1039] (0/4) Epoch 16, batch 4050, loss[loss=0.1701, simple_loss=0.2528, pruned_loss=0.04369, over 24511.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2582, pruned_loss=0.0553, over 4726951.39 frames. ], batch size: 63, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:50:35,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:50:35,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:50:36,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:50:37,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. 
Duration: 0.563625 2023-09-30 01:50:38,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:41,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=558213.3333333334, ans=0.0 2023-09-30 01:50:43,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:45,611 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=558213.3333333334, ans=0.125 2023-09-30 01:50:46,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:50:48,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:50:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:50:51,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:53,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:55,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. 
Duration: 0.563625 2023-09-30 01:50:55,519 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=558280.0, ans=0.0 2023-09-30 01:50:58,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-09-30 01:50:59,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 01:51:03,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 01:51:03,250 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 01:51:06,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:51:07,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=558346.6666666666, ans=0.125 2023-09-30 01:51:14,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 01:51:15,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:19,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:19,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=558346.6666666666, ans=0.0 2023-09-30 01:51:22,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:51:24,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:51:24,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:25,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:51:30,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 01:51:30,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:51:32,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:32,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=558413.3333333334, ans=0.0 2023-09-30 01:51:34,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 01:51:34,795 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.11 vs. limit=15.0 2023-09-30 01:51:40,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:40,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=558480.0, ans=0.0 2023-09-30 01:51:47,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. 
Duration: 0.85 2023-09-30 01:51:48,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:48,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:51:50,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 01:51:50,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 01:51:50,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:51:54,144 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.10 vs. limit=15.0 2023-09-30 01:51:54,856 INFO [train.py:1039] (0/4) Epoch 16, batch 4100, loss[loss=0.1683, simple_loss=0.2426, pruned_loss=0.04705, over 24650.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2589, pruned_loss=0.05568, over 4720137.22 frames. ], batch size: 60, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:51:55,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:51:55,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:51:55,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:52:03,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. 
Duration: 0.5 2023-09-30 01:52:06,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 01:52:06,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=558546.6666666666, ans=0.125 2023-09-30 01:52:07,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 01:52:09,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 01:52:09,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:11,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:52:13,100 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 01:52:16,681 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.940e+02 2.145e+02 2.448e+02 4.292e+02, threshold=4.289e+02, percent-clipped=1.0 2023-09-30 01:52:16,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:18,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 01:52:18,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:20,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:52:23,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:52:24,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:24,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:52:25,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 01:52:25,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:25,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:52:25,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:25,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:52:27,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-09-30 01:52:29,431 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.20 vs. limit=15.0 2023-09-30 01:52:30,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:33,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 01:52:35,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:52:36,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:36,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 01:52:38,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=558680.0, ans=0.0 2023-09-30 01:52:39,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:52:39,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:52:39,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:52:41,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 01:52:43,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. 
Duration: 0.611125 2023-09-30 01:52:43,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:52:46,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 01:52:46,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:48,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:52:48,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=558746.6666666666, ans=0.1 2023-09-30 01:52:50,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:52,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=558746.6666666666, ans=0.125 2023-09-30 01:52:56,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:52:59,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:00,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 01:53:00,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=558813.3333333334, ans=0.05 2023-09-30 01:53:01,548 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.14 vs. limit=15.0 2023-09-30 01:53:10,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:10,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:53:13,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:16,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:53:18,253 INFO [train.py:1039] (0/4) Epoch 16, batch 4150, loss[loss=0.1843, simple_loss=0.2606, pruned_loss=0.05395, over 23285.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2589, pruned_loss=0.05582, over 4723815.88 frames. ], batch size: 93, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:53:20,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:53:21,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:53:21,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:53:21,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 01:53:22,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=558880.0, ans=0.0 2023-09-30 01:53:25,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 01:53:25,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:25,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 01:53:25,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=558880.0, ans=0.07 2023-09-30 01:53:27,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 01:53:27,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 01:53:29,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:33,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:53:33,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:34,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=558946.6666666666, ans=0.125 2023-09-30 01:53:37,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 01:53:39,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:53:39,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:53:42,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:53:42,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:42,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:53:49,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:52,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:53:54,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 01:53:55,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 01:53:55,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:53:57,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 01:53:57,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 01:53:57,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:01,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:02,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:08,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 01:54:10,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:11,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:13,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 01:54:13,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:54:15,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 01:54:19,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:54:19,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:22,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 01:54:22,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 01:54:22,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:22,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:54:22,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=559080.0, ans=0.0 2023-09-30 01:54:23,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:54:26,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 01:54:26,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:26,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:54:26,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:54:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 01:54:28,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:28,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-30 01:54:29,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:54:32,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:32,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 01:54:32,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:38,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:54:40,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 01:54:41,597 INFO [train.py:1039] (0/4) Epoch 16, batch 4200, loss[loss=0.1786, simple_loss=0.2574, pruned_loss=0.04989, over 24512.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2582, pruned_loss=0.05549, over 4709479.90 frames. ], batch size: 66, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:54:41,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:54:43,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:54:44,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:54:46,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 01:54:46,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:49,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 01:54:53,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 01:54:53,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:54,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:58,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:55:02,996 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.901e+02 2.194e+02 2.609e+02 4.040e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-30 01:55:03,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:55:03,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:05,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:05,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 01:55:05,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 01:55:06,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:08,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:55:08,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:55:09,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:55:10,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=559280.0, ans=0.2 2023-09-30 01:55:11,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 01:55:11,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:16,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:55:16,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:55:19,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:55:19,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:55:22,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 01:55:22,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 01:55:24,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:24,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:55:24,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=559346.6666666666, ans=0.125 2023-09-30 01:55:30,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:55:32,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:39,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:55:42,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 01:55:45,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:50,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:55:52,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:55:53,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. 
Duration: 0.94 2023-09-30 01:55:56,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:56:01,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:56:01,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:56:03,853 INFO [train.py:1039] (0/4) Epoch 16, batch 4250, loss[loss=0.1841, simple_loss=0.2659, pruned_loss=0.05112, over 24017.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2577, pruned_loss=0.05507, over 4704543.35 frames. ], batch size: 86, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:56:04,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:04,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=559546.6666666666, ans=0.2 2023-09-30 01:56:08,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=559546.6666666666, ans=0.125 2023-09-30 01:56:09,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:56:09,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 01:56:09,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 01:56:09,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=559546.6666666666, ans=0.125 2023-09-30 01:56:11,795 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.35 vs. limit=15.0 2023-09-30 01:56:12,874 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=559546.6666666666, ans=0.125 2023-09-30 01:56:13,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:15,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=559546.6666666666, ans=0.125 2023-09-30 01:56:17,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:21,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:21,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:22,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=559613.3333333334, ans=0.125 2023-09-30 01:56:23,710 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:56:24,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 01:56:24,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:56:25,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=559613.3333333334, ans=0.125 2023-09-30 01:56:26,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:26,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=559613.3333333334, ans=0.125 2023-09-30 01:56:26,747 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=559613.3333333334, ans=0.0 2023-09-30 01:56:27,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:30,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:31,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:56:33,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:34,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 01:56:40,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. 
Duration: 0.85 2023-09-30 01:56:40,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:42,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:42,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:43,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:56:43,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:43,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:48,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:56:49,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:56:50,816 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=22.5 2023-09-30 01:56:53,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:56:54,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:56,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. 
Duration: 0.65 2023-09-30 01:56:56,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:56:56,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 01:56:57,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:57:00,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:57:02,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:02,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:57:04,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 01:57:06,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:57:06,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:57:08,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=559813.3333333334, ans=0.0 2023-09-30 01:57:08,187 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=559813.3333333334, ans=0.0 2023-09-30 01:57:10,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 01:57:14,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:16,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:57:19,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:57:21,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:21,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:57:21,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:57:21,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 01:57:24,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:26,119 INFO [train.py:1039] (0/4) Epoch 16, batch 4300, loss[loss=0.1727, simple_loss=0.2434, pruned_loss=0.05099, over 24318.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2579, pruned_loss=0.05511, over 4698846.87 frames. ], batch size: 56, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:57:29,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:30,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 01:57:33,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:36,473 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.40 vs. limit=22.5 2023-09-30 01:57:38,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=559880.0, ans=0.125 2023-09-30 01:57:42,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:43,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 01:57:43,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:57:45,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:57:45,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:57:47,246 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.838e+02 2.077e+02 2.423e+02 4.089e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 01:57:47,382 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 01:57:49,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:57:51,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 01:57:55,060 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-84000.pt 2023-09-30 01:57:59,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 01:57:59,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:57:59,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 01:58:02,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:58:03,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:58:07,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:58:07,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:58:07,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:58:09,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:09,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:58:09,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. 
Duration: 0.82 2023-09-30 01:58:10,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 01:58:12,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:58:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:58:17,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:17,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 01:58:17,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 01:58:18,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 01:58:20,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:20,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 01:58:20,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. 
Duration: 0.94 2023-09-30 01:58:22,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=560080.0, ans=0.125 2023-09-30 01:58:25,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:27,444 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 01:58:27,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:58:29,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:29,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:32,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 01:58:33,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:58:33,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:35,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:58:35,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 01:58:35,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=560146.6666666666, ans=0.125 2023-09-30 01:58:36,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:58:38,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:58:38,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=560146.6666666666, ans=0.2 2023-09-30 01:58:41,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:41,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:43,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:43,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=560146.6666666666, ans=0.1 2023-09-30 01:58:47,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 01:58:49,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:58:50,638 INFO [train.py:1039] (0/4) Epoch 16, batch 4350, loss[loss=0.2074, simple_loss=0.2706, pruned_loss=0.07206, over 22739.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2586, pruned_loss=0.05548, over 4703792.99 frames. 
], batch size: 322, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:58:54,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:56,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:00,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:59:00,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:59:05,747 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=560213.3333333334, ans=0.0 2023-09-30 01:59:06,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:59:10,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:13,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:59:13,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:16,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:59:20,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 01:59:22,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:59:27,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 01:59:27,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:28,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:36,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:39,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 01:59:42,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:43,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:59:48,104 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 01:59:51,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:51,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:59:52,780 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. 
Duration: 0.896 2023-09-30 01:59:52,896 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 01:59:52,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:52,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:54,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:59:55,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:56,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:56,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:59,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 01:59:59,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:59,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:59,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:00,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. 
Duration: 0.99 2023-09-30 02:00:02,437 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 02:00:02,446 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 02:00:02,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 02:00:03,109 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.91 vs. limit=15.0 2023-09-30 02:00:07,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:00:07,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:00:07,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:09,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:00:11,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 02:00:11,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=560480.0, ans=0.125 2023-09-30 02:00:13,451 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 02:00:14,780 INFO [train.py:1039] (0/4) Epoch 16, batch 4400, loss[loss=0.1842, simple_loss=0.2688, pruned_loss=0.04981, over 24612.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2591, pruned_loss=0.05544, over 4717955.09 frames. 
], batch size: 68, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:00:14,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:19,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:19,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:22,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:00:24,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 02:00:24,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 02:00:24,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 02:00:24,570 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 02:00:26,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:00:26,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:27,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 02:00:29,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:00:30,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:30,873 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 02:00:35,163 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.842e+02 2.058e+02 2.254e+02 3.655e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:00:35,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:35,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 02:00:36,805 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 02:00:38,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 02:00:40,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 02:00:40,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 02:00:40,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:41,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:41,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:00:43,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:00:45,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 02:00:45,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 02:00:47,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:49,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:00:49,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:51,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:53,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:53,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 02:00:53,227 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 02:00:57,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:59,865 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.62 vs. 
limit=15.0 2023-09-30 02:01:04,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:01:07,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 02:01:10,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:01:13,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:16,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:01:16,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 02:01:18,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:01:18,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:01:18,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:01:18,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:01:24,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 02:01:28,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. 
Duration: 0.75 2023-09-30 02:01:29,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 02:01:29,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:01:29,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 02:01:31,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:01:34,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:01:36,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 02:01:37,435 INFO [train.py:1039] (0/4) Epoch 16, batch 4450, loss[loss=0.1948, simple_loss=0.2796, pruned_loss=0.055, over 24359.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2595, pruned_loss=0.05556, over 4715402.67 frames. ], batch size: 77, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:01:40,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:43,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:45,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:01:51,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:01:51,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:01:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:00,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:02:02,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:02:02,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:02,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 02:02:02,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:04,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:05,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:05,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:02:09,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:02:13,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:02:15,325 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:15,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:17,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=561013.3333333334, ans=0.125 2023-09-30 02:02:18,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:02:23,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:02:24,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 02:02:24,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 02:02:24,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:02:26,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:28,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. 
Duration: 0.66 2023-09-30 02:02:29,307 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.69 vs. limit=15.0 2023-09-30 02:02:32,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:02:37,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:37,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 02:02:37,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:37,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:37,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:02:37,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:39,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:43,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:02:44,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 02:02:46,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 02:02:47,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:50,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:51,071 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=561146.6666666666, ans=0.5 2023-09-30 02:02:52,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:52,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:02:52,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=561146.6666666666, ans=0.125 2023-09-30 02:02:54,644 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.08 vs. limit=6.0 2023-09-30 02:02:55,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:02:58,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 02:03:00,039 INFO [train.py:1039] (0/4) Epoch 16, batch 4500, loss[loss=0.1863, simple_loss=0.2723, pruned_loss=0.05021, over 24567.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2597, pruned_loss=0.05528, over 4729326.10 frames. ], batch size: 71, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:03:00,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 02:03:04,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:05,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 02:03:05,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 02:03:08,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:14,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:03:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:15,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:03:17,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:03:17,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:18,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:03:19,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=561280.0, ans=0.125 2023-09-30 02:03:23,474 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.923e+02 2.201e+02 2.757e+02 3.678e+02, threshold=4.403e+02, percent-clipped=0.0 2023-09-30 02:03:29,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:31,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:03:33,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:03:33,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:03:33,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=561346.6666666666, ans=0.125 2023-09-30 02:03:35,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:03:43,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:03:48,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:03:52,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 02:03:53,293 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.19 vs. limit=6.0 2023-09-30 02:03:55,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:03:57,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 02:03:58,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:03:58,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:04:03,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:04:03,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 02:04:03,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:04:03,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:04:08,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:04:08,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:04:11,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:14,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:04:14,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:04:15,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 02:04:17,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 02:04:17,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 02:04:17,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=561480.0, ans=0.1 2023-09-30 02:04:22,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 02:04:24,007 INFO [train.py:1039] (0/4) Epoch 16, batch 4550, loss[loss=0.2039, simple_loss=0.2734, pruned_loss=0.06724, over 23185.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2587, pruned_loss=0.05545, over 4722795.71 frames. 
], batch size: 105, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:04:26,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 02:04:27,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:32,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:32,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:35,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:35,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=561546.6666666666, ans=0.025 2023-09-30 02:04:38,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:04:40,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:41,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:04:41,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:04:41,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:04:45,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:45,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:50,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:04:54,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 02:04:54,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 02:04:55,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:04:57,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 02:05:00,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 02:05:02,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:03,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.14 vs. limit=10.0 2023-09-30 02:05:05,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 02:05:06,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 02:05:11,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:05:13,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 02:05:16,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:18,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:18,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=561746.6666666666, ans=0.1 2023-09-30 02:05:19,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:19,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:22,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 02:05:22,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 02:05:22,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 02:05:22,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 02:05:25,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 02:05:25,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:27,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:27,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:28,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:28,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:05:31,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:05:31,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 02:05:33,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=561813.3333333334, ans=0.125 2023-09-30 02:05:34,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:34,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-30 02:05:34,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 02:05:34,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:05:37,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 02:05:40,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:05:40,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:05:43,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:05:43,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:43,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:05:46,094 INFO [train.py:1039] (0/4) Epoch 16, batch 4600, loss[loss=0.1951, simple_loss=0.2544, pruned_loss=0.06787, over 23732.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2573, pruned_loss=0.05483, over 4715723.03 frames. ], batch size: 179, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:05:46,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:05:47,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 02:05:49,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:51,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:55,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:05:55,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:05:55,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:05:56,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 02:05:58,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:06:02,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:06:02,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:05,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:09,848 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.941e+02 2.143e+02 2.418e+02 3.430e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 02:06:12,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. 
Duration: 0.98 2023-09-30 02:06:13,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:15,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:18,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:06:18,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:25,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 02:06:25,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:06:25,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:06:32,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=562013.3333333334, ans=0.125 2023-09-30 02:06:33,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:34,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:06:35,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:06:40,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. 
Duration: 0.69 2023-09-30 02:06:40,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:06:45,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:47,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:06:50,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:50,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 02:06:50,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:51,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 02:06:51,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:53,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:54,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:54,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:58,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:06:58,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 02:06:58,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 02:06:58,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 02:06:59,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:01,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:01,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:03,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:07:09,714 INFO [train.py:1039] (0/4) Epoch 16, batch 4650, loss[loss=0.1699, simple_loss=0.2533, pruned_loss=0.04323, over 24470.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2563, pruned_loss=0.05456, over 4707647.56 frames. ], batch size: 63, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:07:12,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=562213.3333333334, ans=0.125 2023-09-30 02:07:13,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:07:16,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:07:16,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:16,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:07:16,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:16,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:20,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:21,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 02:07:26,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:07:28,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 02:07:29,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:29,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 02:07:29,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:07:30,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. 
Duration: 0.99 2023-09-30 02:07:30,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 02:07:30,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:33,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:07:36,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:07:38,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:38,303 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 02:07:41,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:44,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 02:07:46,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:46,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:07:48,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 02:07:48,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:07:51,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:07:56,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:01,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:04,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:06,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:06,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:08:10,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 02:08:11,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 02:08:13,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 02:08:13,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 02:08:14,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:08:16,878 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=562480.0, ans=0.0 2023-09-30 02:08:19,998 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:08:21,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:08:21,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:23,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 02:08:23,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:23,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=562480.0, ans=0.125 2023-09-30 02:08:24,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:24,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:08:26,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:08:30,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:08:30,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:08:32,595 INFO [train.py:1039] (0/4) Epoch 16, batch 4700, loss[loss=0.1743, simple_loss=0.2529, pruned_loss=0.04779, over 24628.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2568, pruned_loss=0.05475, over 4716168.07 frames. ], batch size: 65, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:08:32,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:35,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:35,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:08:35,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:08:37,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:08:38,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:08:38,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 02:08:43,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=562546.6666666666, ans=0.125 2023-09-30 02:08:49,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:50,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:08:50,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:08:52,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:53,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:08:55,747 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.890e+02 2.173e+02 2.583e+02 3.915e+02, threshold=4.346e+02, percent-clipped=0.0 2023-09-30 02:08:59,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 02:08:59,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 02:09:02,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:02,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:09:03,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:09:07,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:15,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 02:09:16,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:09:18,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:09:24,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 02:09:26,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:09:27,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:34,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 02:09:35,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:09:39,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:09:39,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 02:09:40,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:40,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:44,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:09:44,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:09:44,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 02:09:47,324 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 02:09:48,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:51,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 02:09:53,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:54,804 INFO [train.py:1039] (0/4) Epoch 16, batch 4750, loss[loss=0.1604, simple_loss=0.2344, pruned_loss=0.0432, over 24403.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2576, pruned_loss=0.0551, over 4710209.39 frames. ], batch size: 58, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:09:55,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 02:09:57,314 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.43 vs. 
limit=15.0 2023-09-30 02:09:58,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:09:58,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:09:59,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=562880.0, ans=0.125 2023-09-30 02:10:02,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:04,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:10:04,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 02:10:04,999 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.52 vs. limit=15.0 2023-09-30 02:10:05,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:06,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=562880.0, ans=0.125 2023-09-30 02:10:08,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 02:10:10,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 02:10:10,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:10:11,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:13,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=562946.6666666666, ans=0.2 2023-09-30 02:10:18,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 02:10:23,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:10:26,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 02:10:27,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:29,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:31,429 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 02:10:31,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-09-30 02:10:38,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 02:10:39,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:42,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:10:44,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:10:44,455 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 02:10:44,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:10:47,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:10:49,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:10:50,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 02:10:51,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 02:10:53,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:53,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 02:10:54,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:54,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:10:56,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 02:10:59,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 02:11:02,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:07,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:11:07,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 02:11:07,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:08,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:11,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:11:12,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:12,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 02:11:16,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:17,292 INFO [train.py:1039] (0/4) Epoch 16, batch 4800, loss[loss=0.1646, simple_loss=0.2536, pruned_loss=0.03781, over 24620.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2585, pruned_loss=0.05554, over 4708147.20 frames. ], batch size: 68, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:11:17,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 02:11:17,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 02:11:18,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 02:11:22,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:11:22,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:23,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 02:11:28,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:28,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:33,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 02:11:34,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:35,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:36,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 02:11:38,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:38,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:11:40,585 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.887e+02 2.070e+02 2.375e+02 3.869e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 02:11:42,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:11:46,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:11:47,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:47,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:11:47,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:11:48,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:11:48,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=563280.0, ans=0.0 2023-09-30 02:11:49,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:49,686 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:11:50,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:54,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:55,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:11:58,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:11:59,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:02,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. 
Duration: 0.85 2023-09-30 02:12:02,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 02:12:04,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:04,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:12:05,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:12:05,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:05,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:12:05,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=563413.3333333334, ans=0.125 2023-09-30 02:12:06,641 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=13.89 vs. limit=15.0 2023-09-30 02:12:07,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:12:07,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 02:12:09,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=563413.3333333334, ans=0.125 2023-09-30 02:12:11,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:16,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:17,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:18,225 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-09-30 02:12:18,278 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.72 vs. limit=10.0 2023-09-30 02:12:22,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 02:12:22,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:23,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:24,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:12:24,975 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.94 vs. 
limit=22.5 2023-09-30 02:12:25,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:29,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:30,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:12:30,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:32,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:12:33,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:12:33,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:12:37,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:37,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:37,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:40,152 INFO [train.py:1039] (0/4) Epoch 16, batch 4850, loss[loss=0.1738, simple_loss=0.2514, pruned_loss=0.04816, over 24478.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2585, pruned_loss=0.05574, over 4705783.30 frames. 
], batch size: 63, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:12:40,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 02:12:41,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 02:12:41,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:41,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:43,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:12:43,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:46,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:47,524 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.28 vs. limit=15.0 2023-09-30 02:12:52,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=563546.6666666666, ans=0.125 2023-09-30 02:12:53,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 02:12:55,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:12:56,081 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.11 vs. limit=15.0 2023-09-30 02:13:00,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:01,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:13:01,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:04,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:06,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:13:06,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:13:06,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 02:13:11,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:13:14,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:13:14,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 02:13:16,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:13:16,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 02:13:17,068 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.96 vs. limit=10.0 2023-09-30 02:13:19,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:19,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:19,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=563680.0, ans=0.125 2023-09-30 02:13:24,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 02:13:25,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 02:13:25,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:13:34,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:13:34,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-09-30 02:13:35,532 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.29 vs. limit=10.0 2023-09-30 02:13:37,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:13:37,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:13:39,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:13:40,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 02:13:40,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:42,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 02:13:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:42,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:13:44,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 02:13:48,235 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.81 vs. 
limit=6.0 2023-09-30 02:13:53,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:55,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=563813.3333333334, ans=0.125 2023-09-30 02:13:59,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:13:59,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:02,912 INFO [train.py:1039] (0/4) Epoch 16, batch 4900, loss[loss=0.1752, simple_loss=0.2628, pruned_loss=0.04379, over 24549.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2575, pruned_loss=0.05569, over 4700570.28 frames. ], batch size: 71, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:14:05,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 02:14:05,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:14:07,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=563880.0, ans=0.05 2023-09-30 02:14:10,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:12,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:12,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 02:14:13,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=563880.0, ans=0.1 2023-09-30 02:14:13,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=563880.0, ans=0.04949747468305833 2023-09-30 02:14:15,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 02:14:20,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 02:14:24,977 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.946e+02 2.133e+02 2.467e+02 3.436e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 02:14:27,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 02:14:27,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 02:14:28,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:28,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:28,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:14:28,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:28,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 02:14:30,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 02:14:34,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 02:14:36,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:14:37,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:14:39,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:41,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:14:42,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:42,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:42,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 02:14:44,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:14:44,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:46,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. 
Duration: 0.81 2023-09-30 02:14:46,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 02:14:49,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 02:14:50,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:14:51,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=564080.0, ans=0.125 2023-09-30 02:14:52,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:14:52,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:14:52,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:54,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:14:54,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:14:54,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 02:14:56,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:59,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 02:15:02,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:15:04,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 02:15:06,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:15:06,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:15:08,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 02:15:13,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:13,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=564146.6666666666, ans=0.125 2023-09-30 02:15:14,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:15:16,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 02:15:16,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:16,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:15:19,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:15:23,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:23,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:15:24,989 INFO [train.py:1039] (0/4) Epoch 16, batch 4950, loss[loss=0.182, simple_loss=0.2511, pruned_loss=0.05646, over 24400.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2571, pruned_loss=0.05527, over 4694620.59 frames. ], batch size: 58, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:15:25,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:25,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 02:15:27,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:15:28,434 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=12.0 2023-09-30 02:15:30,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:30,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:35,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 02:15:35,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. 
Duration: 0.8 2023-09-30 02:15:35,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:15:37,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 02:15:37,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:37,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:15:38,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:15:38,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:15:42,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:42,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:15:44,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:15:46,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:47,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:47,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:15:50,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:15:55,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:55,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=564280.0, ans=0.04949747468305833 2023-09-30 02:15:56,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:16:00,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:00,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:01,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:16:02,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 02:16:03,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 02:16:03,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=564346.6666666666, ans=0.1 2023-09-30 02:16:06,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 02:16:07,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=564346.6666666666, ans=0.1 2023-09-30 02:16:08,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:16:08,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:16:10,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:16:10,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:16:11,482 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.68 vs. limit=5.0 2023-09-30 02:16:11,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:16:13,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:17,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:16:20,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:16:22,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:23,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 02:16:23,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 02:16:24,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:16:26,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:16:28,808 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.77 vs. limit=15.0 2023-09-30 02:16:31,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:16:32,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:16:32,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:16:32,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:32,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:16:34,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:16:36,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:16:37,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:16:37,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:39,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 02:16:39,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=564480.0, ans=0.2 2023-09-30 02:16:45,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:16:47,452 INFO [train.py:1039] (0/4) Epoch 16, batch 5000, loss[loss=0.175, simple_loss=0.2495, pruned_loss=0.05021, over 24276.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2561, pruned_loss=0.05523, over 4678921.17 frames. ], batch size: 56, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:16:50,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 02:16:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:16:51,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=564546.6666666666, ans=0.2 2023-09-30 02:16:56,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:56,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 02:16:59,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 02:16:59,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 02:17:02,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:04,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 02:17:04,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:17:05,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:17:07,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 02:17:07,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:09,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:09,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 02:17:09,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:09,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:17:10,776 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.922e+02 2.167e+02 2.587e+02 4.159e+02, threshold=4.333e+02, percent-clipped=0.0 2023-09-30 02:17:12,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 02:17:12,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 02:17:13,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:17:13,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 02:17:13,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:17:14,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:15,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:17:15,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 02:17:15,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 02:17:17,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 02:17:17,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:17:19,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:19,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 02:17:20,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:17:22,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:22,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:24,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:17:26,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 02:17:26,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:17:26,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=564680.0, ans=0.125 2023-09-30 02:17:27,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 02:17:30,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=564680.0, ans=0.125 2023-09-30 02:17:30,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=564680.0, ans=0.1 2023-09-30 02:17:31,551 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 02:17:36,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:36,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:36,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:17:38,469 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.01 vs. limit=15.0 2023-09-30 02:17:40,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 02:17:40,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:41,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:42,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:17:44,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. 
Duration: 0.336375 2023-09-30 02:17:46,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:54,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 02:17:59,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=564813.3333333334, ans=0.2 2023-09-30 02:18:02,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:02,774 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=564813.3333333334, ans=0.1 2023-09-30 02:18:10,336 INFO [train.py:1039] (0/4) Epoch 16, batch 5050, loss[loss=0.1891, simple_loss=0.2541, pruned_loss=0.06205, over 23758.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.256, pruned_loss=0.05518, over 4685727.45 frames. ], batch size: 232, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:18:10,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:18:12,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:12,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 02:18:12,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:13,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:18:13,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:18:13,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:15,238 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=564880.0, ans=0.125 2023-09-30 02:18:18,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:18,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 02:18:20,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:18:23,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:24,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:18:24,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 02:18:26,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:18:26,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:18:30,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:18:31,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:18:31,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:18:35,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=564946.6666666666, ans=0.125 2023-09-30 02:18:35,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=564946.6666666666, ans=0.1 2023-09-30 02:18:41,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 02:18:43,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:18:44,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:18:44,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 02:18:44,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:18:44,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:18:46,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:46,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:18:46,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 02:18:47,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 02:18:47,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:52,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:18:54,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:55,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 02:18:57,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:00,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 02:19:01,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:19:01,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 02:19:04,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:04,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:19:05,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:07,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:19:08,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:09,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:19:09,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:19:10,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 02:19:11,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:19:13,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:19:17,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:17,131 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. 
Duration: 0.896 2023-09-30 02:19:17,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:19:18,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:20,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:20,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 02:19:23,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:23,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 02:19:23,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:26,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:26,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:28,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 02:19:28,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 02:19:31,359 INFO [train.py:1039] (0/4) Epoch 16, batch 5100, loss[loss=0.1842, simple_loss=0.2548, pruned_loss=0.05678, over 23289.00 frames. 
], tot_loss[loss=0.1843, simple_loss=0.2571, pruned_loss=0.05576, over 4678461.73 frames. ], batch size: 105, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:19:31,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:31,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:19:31,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:19:34,692 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 02:19:38,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:42,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 02:19:42,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 02:19:44,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:45,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:48,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:19:48,765 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=565280.0, ans=0.2 2023-09-30 02:19:50,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 02:19:50,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 02:19:53,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=565280.0, ans=0.125 2023-09-30 02:19:54,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:54,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:19:55,995 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.868e+02 2.082e+02 2.336e+02 3.756e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-30 02:19:57,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:00,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=565280.0, ans=0.125 2023-09-30 02:20:01,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 02:20:01,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:04,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:20:04,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 02:20:08,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:08,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:09,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 02:20:12,019 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 02:20:12,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:14,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 02:20:14,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 02:20:16,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:25,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:20:27,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. 
Duration: 0.5 2023-09-30 02:20:27,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=565413.3333333334, ans=0.0 2023-09-30 02:20:28,905 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 02:20:28,918 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 02:20:29,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 02:20:29,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:29,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=565413.3333333334, ans=0.1 2023-09-30 02:20:29,453 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:20:32,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 02:20:35,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 02:20:37,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:20:39,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:20:40,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. 
Duration: 0.87 2023-09-30 02:20:44,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:20:45,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 02:20:49,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=565480.0, ans=0.1 2023-09-30 02:20:51,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:20:51,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:20:51,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:20:53,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:20:54,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:20:56,291 INFO [train.py:1039] (0/4) Epoch 16, batch 5150, loss[loss=0.183, simple_loss=0.2664, pruned_loss=0.04982, over 24653.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2586, pruned_loss=0.05627, over 4677991.10 frames. ], batch size: 73, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:20:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:56,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. 
Duration: 0.89 2023-09-30 02:20:56,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 02:20:57,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 02:20:59,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:20:59,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 02:21:00,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:01,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:21:01,323 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=565546.6666666666, ans=0.125 2023-09-30 02:21:02,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:04,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:04,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=565546.6666666666, ans=0.125 2023-09-30 02:21:10,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:21:10,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. 
Duration: 0.8 2023-09-30 02:21:12,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:12,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:21:15,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:21:15,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:15,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:17,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:21:17,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:21:17,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 02:21:18,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:21:20,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:21:23,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:21:24,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. 
Duration: 0.79 2023-09-30 02:21:27,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:21:32,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:21:35,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 02:21:38,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:39,344 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.27 vs. limit=15.0 2023-09-30 02:21:45,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:46,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:50,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:21:50,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:21:53,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 02:21:58,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:22:00,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:22:00,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:22:00,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=565813.3333333334, ans=0.0 2023-09-30 02:22:01,132 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.24 vs. limit=15.0 2023-09-30 02:22:02,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:04,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:06,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 02:22:09,898 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.26 vs. limit=22.5 2023-09-30 02:22:10,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:10,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:22:14,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:22:14,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:22:14,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:22:15,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:22:15,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:22:15,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:22:19,307 INFO [train.py:1039] (0/4) Epoch 16, batch 5200, loss[loss=0.2492, simple_loss=0.3051, pruned_loss=0.09661, over 19860.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2593, pruned_loss=0.05659, over 4686644.49 frames. ], batch size: 388, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:22:19,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:22:22,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:22:23,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:29,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 02:22:29,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 02:22:30,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:32,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:34,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:22:34,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:39,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 02:22:41,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:22:41,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:44,238 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 1.894e+02 2.058e+02 2.319e+02 3.515e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:22:44,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 02:22:45,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:22:46,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:22:47,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. 
Duration: 0.55 2023-09-30 02:22:49,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 02:22:52,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 02:22:53,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:54,338 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 02:22:54,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:55,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:55,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:22:55,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 02:22:56,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=566013.3333333334, ans=0.125 2023-09-30 02:22:57,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:58,153 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.65 vs. limit=15.0 2023-09-30 02:22:58,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:23:02,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 02:23:02,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 02:23:04,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 02:23:08,624 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.67 vs. limit=22.5 2023-09-30 02:23:09,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 02:23:11,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:23:16,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:23:16,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:18,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 02:23:19,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:19,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:23:19,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:23:19,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:23:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:22,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:23:27,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:23:29,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:29,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:34,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:35,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 02:23:37,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:37,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:23:38,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:40,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 02:23:40,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:23:42,300 INFO [train.py:1039] (0/4) Epoch 16, batch 5250, loss[loss=0.1653, simple_loss=0.2433, pruned_loss=0.04365, over 24583.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.259, pruned_loss=0.05615, over 4690022.43 frames. ], batch size: 60, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:23:45,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:23:48,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=566213.3333333334, ans=0.2 2023-09-30 02:23:50,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:50,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:23:51,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:23:58,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:58,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:24:01,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:24:02,255 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.43 vs. 
limit=15.0 2023-09-30 02:24:03,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:24:05,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 02:24:05,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:24:08,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:24:19,477 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.84 vs. limit=15.0 2023-09-30 02:24:24,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=566346.6666666666, ans=0.2 2023-09-30 02:24:29,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=566413.3333333334, ans=0.1 2023-09-30 02:24:37,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=566413.3333333334, ans=0.0 2023-09-30 02:24:40,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=566413.3333333334, ans=0.125 2023-09-30 02:24:51,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=566480.0, ans=0.1 2023-09-30 02:24:51,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=566480.0, ans=0.1 2023-09-30 02:24:57,069 INFO [train.py:1039] (0/4) Epoch 16, batch 5300, loss[loss=0.1632, 
simple_loss=0.2247, pruned_loss=0.05084, over 23630.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2578, pruned_loss=0.05514, over 4701680.19 frames. ], batch size: 256, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:25:03,715 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.58 vs. limit=15.0 2023-09-30 02:25:12,134 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-16.pt 2023-09-30 02:25:17,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:25:17,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 02:25:17,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 02:25:17,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:18,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:18,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:18,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:18,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:18,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:25:18,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:18,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:25:19,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:25:19,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 02:25:19,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 02:25:19,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 02:25:19,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:25:19,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 02:25:20,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 02:25:20,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:20,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:20,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:25:20,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:21,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:25:22,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:22,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:22,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:22,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:22,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:22,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:25:22,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:22,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:25:23,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 02:25:23,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:25:23,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:23,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 02:25:23,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 02:25:24,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:25:24,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:24,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 02:25:24,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 02:25:24,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:25,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:25:25,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:26,066 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 02:25:26,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. 
Duration: 0.96 2023-09-30 02:25:26,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:25:26,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:26,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 02:25:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 02:25:26,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 02:25:26,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:29,925 INFO [train.py:1039] (0/4) Epoch 17, batch 0, loss[loss=0.1848, simple_loss=0.2589, pruned_loss=0.05536, over 23379.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2589, pruned_loss=0.05536, over 23379.00 frames. ], batch size: 119, lr: 6.17e-03, grad_scale: 32.0 2023-09-30 02:25:29,926 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 02:25:40,381 INFO [zipformer.py:1853] (0/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.4122, 2.7897, 3.7456, 2.4983], device='cuda:0') 2023-09-30 02:25:43,981 INFO [train.py:1071] (0/4) Epoch 17, validation: loss=0.3013, simple_loss=0.2697, pruned_loss=0.1665, over 1125622.00 frames. 2023-09-30 02:25:43,982 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 02:25:45,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. 
Duration: 0.54 2023-09-30 02:25:47,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:25:49,123 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.973e+02 2.191e+02 2.524e+02 3.767e+02, threshold=4.382e+02, percent-clipped=0.0 2023-09-30 02:25:49,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:25:56,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:56,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:25:56,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:58,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 02:26:00,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 02:26:01,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:02,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:03,373 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.75 vs. 
limit=15.0 2023-09-30 02:26:05,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:05,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:07,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:26:07,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:09,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 02:26:10,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:17,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=566760.0, ans=0.125 2023-09-30 02:26:19,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:26:19,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:26:19,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=566760.0, ans=0.5 2023-09-30 02:26:19,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=566760.0, ans=0.2 2023-09-30 02:26:21,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=566760.0, ans=0.0 2023-09-30 02:26:22,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 02:26:26,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:26:26,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:26:27,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:32,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:26:35,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:40,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 02:26:45,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 02:26:45,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:26:45,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:46,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:26:47,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:49,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 02:26:50,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=566893.3333333334, ans=0.0 2023-09-30 02:26:52,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:53,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:58,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:02,191 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 02:27:03,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:27:07,324 INFO [train.py:1039] (0/4) Epoch 17, batch 50, loss[loss=0.1887, simple_loss=0.2597, pruned_loss=0.05888, over 23586.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2607, pruned_loss=0.05625, over 1062329.15 frames. 
], batch size: 256, lr: 6.17e-03, grad_scale: 16.0 2023-09-30 02:27:07,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:10,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:10,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 02:27:10,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:27:11,403 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.54 vs. limit=6.0 2023-09-30 02:27:12,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:27:13,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:15,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:18,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:22,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 02:27:22,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:27:27,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:27:30,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 02:27:31,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 02:27:32,702 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-09-30 02:27:34,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:27:37,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:27:37,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:37,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:27:37,766 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-09-30 02:27:38,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:27:38,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-30 02:27:38,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:46,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:27:47,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:48,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:27:48,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 02:27:48,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=567093.3333333334, ans=0.125 2023-09-30 02:27:52,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:27:53,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:27:53,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 02:27:53,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:56,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. 
Duration: 0.97 2023-09-30 02:27:56,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=567160.0, ans=0.125 2023-09-30 02:28:05,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:05,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:28:05,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:06,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:06,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:09,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 02:28:09,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 02:28:13,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:13,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:13,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:28:13,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:28:15,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 02:28:16,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 02:28:18,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:28:18,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:18,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:28:19,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 02:28:19,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 02:28:19,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:21,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:24,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:28:24,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:28:24,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=567226.6666666666, ans=0.125 2023-09-30 02:28:28,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:28:29,422 INFO [train.py:1039] (0/4) Epoch 17, batch 100, loss[loss=0.1719, simple_loss=0.2512, pruned_loss=0.04631, over 24680.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2605, pruned_loss=0.05556, over 1874235.38 frames. ], batch size: 65, lr: 6.16e-03, grad_scale: 16.0 2023-09-30 02:28:31,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:28:34,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:36,147 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.907e+02 2.184e+02 2.612e+02 4.946e+02, threshold=4.368e+02, percent-clipped=2.0 2023-09-30 02:28:36,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 02:28:36,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:39,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:28:39,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:41,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 02:28:41,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:41,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:42,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 02:28:44,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:28:44,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:45,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:45,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:49,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 02:28:51,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:52,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:54,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 02:28:54,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=567360.0, ans=0.125 2023-09-30 02:28:54,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=567360.0, ans=0.125 2023-09-30 02:28:55,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:28:59,507 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 02:28:59,531 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 02:28:59,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:28:59,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:29:02,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:29:05,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:29:07,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,494 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-09-30 02:29:15,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:29:21,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:21,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:29:22,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:27,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:28,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:31,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:29:32,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:34,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:37,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:37,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:29:37,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 02:29:38,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 02:29:38,703 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 02:29:38,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:38,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:29:40,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:40,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:40,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:29:40,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=567560.0, ans=0.125 2023-09-30 02:29:41,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:29:41,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:29:41,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:41,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:29:43,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:43,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:29:45,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:29:46,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:49,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:29:51,219 INFO [train.py:1039] (0/4) Epoch 17, batch 150, loss[loss=0.2037, simple_loss=0.2683, pruned_loss=0.06955, over 23921.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2601, pruned_loss=0.05572, over 2499829.55 frames. ], batch size: 195, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:29:51,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:53,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:55,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:57,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 02:29:58,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:03,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 02:30:03,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 02:30:03,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 02:30:05,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:30:06,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:30:08,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:30:09,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:30:09,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:09,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:11,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:12,760 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-30 02:30:14,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:21,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:25,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:30:25,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 02:30:29,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:30:29,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:29,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:30:33,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:30:36,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:30:36,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:30:38,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:39,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. 
Duration: 0.74 2023-09-30 02:30:46,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:47,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:30:48,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:30:48,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:30:51,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:52,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 02:30:53,456 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.41 vs. limit=15.0 2023-09-30 02:30:55,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:30:57,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:30:59,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:02,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 02:31:02,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 02:31:02,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:31:03,823 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 02:31:04,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=567893.3333333334, ans=0.1 2023-09-30 02:31:06,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:09,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=567893.3333333334, ans=0.125 2023-09-30 02:31:12,024 INFO [train.py:1039] (0/4) Epoch 17, batch 200, loss[loss=0.2565, simple_loss=0.3208, pruned_loss=0.09616, over 19613.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2606, pruned_loss=0.05582, over 3000530.59 frames. ], batch size: 388, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:31:12,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:31:12,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:31:15,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 02:31:16,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:31:17,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=567960.0, ans=0.0 2023-09-30 02:31:18,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:19,205 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=567960.0, ans=0.05 2023-09-30 02:31:20,236 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.902e+02 2.115e+02 2.489e+02 3.841e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 02:31:22,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 02:31:23,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:31:23,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:25,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:28,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:31:28,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:28,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:31:39,943 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=568026.6666666666, ans=0.025 2023-09-30 02:31:40,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=568026.6666666666, ans=0.125 2023-09-30 02:31:50,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:31:51,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:31:53,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:31:53,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:31:55,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:31:55,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:31:56,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:57,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:31:58,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:31:58,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:00,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 02:32:00,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:32:00,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:32:10,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:32:15,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:15,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:32:20,000 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=568226.6666666666, ans=0.1 2023-09-30 02:32:22,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:26,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 02:32:26,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:32:27,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:32:27,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:29,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:32:30,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 02:32:32,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:32:32,412 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 02:32:35,454 INFO [train.py:1039] (0/4) Epoch 17, batch 250, loss[loss=0.1763, simple_loss=0.2613, pruned_loss=0.04572, over 24458.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2594, pruned_loss=0.05599, over 3363640.98 frames. ], batch size: 69, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:32:35,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:37,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:32:39,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:32:39,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=568293.3333333334, ans=0.125 2023-09-30 02:32:40,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:42,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:32:43,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:45,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:32:48,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:32:48,819 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=568293.3333333334, ans=0.125 2023-09-30 02:33:00,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:02,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:03,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:33:10,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 02:33:12,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:33:14,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:33:14,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:15,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:33:15,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:33:15,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:16,589 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.10 vs. limit=15.0 2023-09-30 02:33:18,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:33:23,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 02:33:23,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:24,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:33:24,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 02:33:24,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:33:25,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=568493.3333333334, ans=0.125 2023-09-30 02:33:26,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:33:27,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:33:28,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:33:31,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:31,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:33:31,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:35,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:33:40,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:43,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 02:33:48,486 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=568560.0, ans=0.125 2023-09-30 02:33:50,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:51,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:33:54,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 02:33:57,775 INFO [train.py:1039] (0/4) Epoch 17, batch 300, loss[loss=0.1694, simple_loss=0.2407, pruned_loss=0.049, over 24362.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2579, pruned_loss=0.05498, over 3670746.60 frames. ], batch size: 56, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:33:57,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:59,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:34:00,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 02:34:00,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:34:02,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:34:02,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-09-30 02:34:05,244 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.942e+02 2.251e+02 2.659e+02 4.378e+02, threshold=4.502e+02, percent-clipped=1.0 2023-09-30 02:34:07,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:34:07,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:10,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:34:11,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 02:34:14,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:34:14,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:34:14,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 02:34:14,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:34:16,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=568693.3333333334, ans=0.0 2023-09-30 02:34:16,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=568693.3333333334, ans=0.5 2023-09-30 02:34:17,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:34:22,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:34:24,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 02:34:27,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 02:34:27,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:30,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:33,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:33,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 02:34:33,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:34:35,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 02:34:36,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:34:37,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:34:42,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:34:42,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 02:34:44,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:34:47,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:48,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 02:34:49,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:54,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:34:57,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=568826.6666666666, ans=0.125 2023-09-30 02:34:59,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:35:00,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-09-30 02:35:03,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:03,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:35:07,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:08,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:35:08,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 02:35:08,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:35:08,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:10,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 02:35:13,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:13,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:14,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=568893.3333333334, ans=0.05 2023-09-30 02:35:15,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:35:15,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:15,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:20,312 INFO [train.py:1039] (0/4) Epoch 17, batch 350, loss[loss=0.1829, simple_loss=0.2616, pruned_loss=0.05206, over 23529.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2558, pruned_loss=0.05485, over 3893414.94 frames. ], batch size: 93, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:35:22,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:22,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:35:25,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:31,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:35,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:38,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 02:35:41,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:35:41,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 02:35:41,865 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=569026.6666666666, ans=0.2 2023-09-30 02:35:43,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:43,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 02:35:44,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:48,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 02:35:52,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:35:53,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:55,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:35:55,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:55,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:57,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 02:35:57,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:58,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:36:00,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:00,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:06,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:09,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:36:09,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:36:10,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:16,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 02:36:16,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:20,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:20,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:36:20,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:36:22,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 02:36:23,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:23,965 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 02:36:27,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 02:36:27,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:30,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:36:30,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 02:36:32,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:33,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:36:35,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:37,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:36:37,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:41,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:43,986 INFO [train.py:1039] (0/4) Epoch 17, batch 400, loss[loss=0.1838, simple_loss=0.2327, pruned_loss=0.06744, over 19326.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2552, pruned_loss=0.05474, over 4067485.64 frames. ], batch size: 388, lr: 6.15e-03, grad_scale: 16.0 2023-09-30 02:36:44,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:45,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:36:47,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 02:36:47,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:47,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:50,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:36:50,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:51,083 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.88 vs. 
limit=15.0 2023-09-30 02:36:51,852 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.862e+02 1.986e+02 2.213e+02 4.165e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 02:36:55,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:57,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:57,576 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569293.3333333334, ans=0.1 2023-09-30 02:36:58,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 02:36:59,232 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:37:00,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 02:37:00,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:00,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569360.0, ans=0.1 2023-09-30 02:37:02,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 02:37:04,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:05,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 02:37:05,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:05,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 02:37:07,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:37:07,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:08,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:08,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:37:12,458 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 02:37:12,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=569360.0, ans=0.0 2023-09-30 02:37:13,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 02:37:17,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:18,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:19,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. 
Duration: 0.81 2023-09-30 02:37:22,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 02:37:25,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:37:29,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:35,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 02:37:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:37:39,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=569493.3333333334, ans=0.125 2023-09-30 02:37:40,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 02:37:40,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:43,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:37:43,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 02:37:47,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:37:51,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 02:37:52,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:54,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:55,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 02:37:57,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:37:58,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 02:37:59,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:37:59,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:38:02,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 02:38:05,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:38:05,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:38:05,996 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:38:07,149 INFO [train.py:1039] (0/4) Epoch 17, batch 450, loss[loss=0.198, simple_loss=0.2621, pruned_loss=0.06696, over 23778.00 frames. 
], tot_loss[loss=0.1834, simple_loss=0.2564, pruned_loss=0.05524, over 4192716.15 frames. ], batch size: 212, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:38:07,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:38:07,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 02:38:07,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:38:08,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:38:09,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:38:09,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 02:38:09,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:38:11,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:38:14,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:38:20,041 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.90 vs. limit=6.0 2023-09-30 02:38:24,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:38:24,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:38:26,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 02:38:28,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 02:38:32,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:38:36,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:37,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:42,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:42,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:45,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 02:38:45,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 02:38:47,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 02:38:48,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 02:38:48,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:50,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:38:52,620 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 02:38:52,635 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 02:38:52,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:54,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:38:57,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:39:00,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:39:02,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:39:02,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:39:04,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 02:39:05,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 02:39:08,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:39:09,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:39:11,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 02:39:16,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:39:16,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 02:39:18,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 02:39:19,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:23,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:39:26,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:27,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:39:27,890 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 02:39:29,812 INFO [train.py:1039] (0/4) Epoch 17, batch 500, loss[loss=0.1963, simple_loss=0.2606, pruned_loss=0.06604, over 23691.00 frames. 
], tot_loss[loss=0.1837, simple_loss=0.2571, pruned_loss=0.05511, over 4316569.82 frames. ], batch size: 232, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:39:33,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:33,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:39:33,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:35,220 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 02:39:36,021 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.21 vs. limit=10.0 2023-09-30 02:39:36,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 02:39:36,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:39,661 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.860e+02 2.180e+02 2.486e+02 3.417e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:39:39,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:39:44,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:39:46,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 02:39:47,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:47,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:48,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:39:59,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:39:59,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:40:01,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:40:01,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:01,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 02:40:01,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:40:05,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:40:05,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:40:05,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 02:40:05,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:07,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 02:40:10,506 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 02:40:13,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:13,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=570093.3333333334, ans=0.125 2023-09-30 02:40:15,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:17,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:40:20,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 02:40:24,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:40:26,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:40:31,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:34,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:36,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=570226.6666666666, ans=0.125 2023-09-30 02:40:40,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:45,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 02:40:45,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:45,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:47,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 02:40:48,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:40:48,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:53,245 INFO [train.py:1039] (0/4) Epoch 17, batch 550, loss[loss=0.2186, simple_loss=0.2815, pruned_loss=0.07791, over 22674.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2576, pruned_loss=0.05484, over 4414708.61 frames. 
], batch size: 322, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:40:53,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=570293.3333333334, ans=0.125 2023-09-30 02:40:54,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 02:40:56,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 02:40:56,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:56,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 02:40:57,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:40:57,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:59,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:00,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:00,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:41:02,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 02:41:05,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:41:06,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 02:41:06,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:41:09,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=570360.0, ans=0.125 2023-09-30 02:41:11,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:11,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:13,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:14,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:20,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 02:41:20,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 02:41:23,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:41:25,875 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.45 vs. 
limit=22.5 2023-09-30 02:41:26,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=570426.6666666666, ans=0.125 2023-09-30 02:41:27,366 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.65 vs. limit=15.0 2023-09-30 02:41:27,377 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.95 vs. limit=15.0 2023-09-30 02:41:27,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:41:27,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:29,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:41:34,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:34,612 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 02:41:36,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:37,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-30 02:41:39,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=570426.6666666666, ans=0.0 2023-09-30 02:41:42,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:42,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:41:42,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:41:44,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:45,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 02:41:47,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 02:41:47,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:41:47,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:47,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:41:47,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:41:51,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 02:41:54,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:41:57,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:41:57,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:59,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:42:00,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:42:02,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:02,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:42:02,989 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.71 vs. limit=22.5 2023-09-30 02:42:03,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:03,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:42:03,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. 
Duration: 0.4 2023-09-30 02:42:10,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=570560.0, ans=0.1 2023-09-30 02:42:12,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 02:42:15,728 INFO [train.py:1039] (0/4) Epoch 17, batch 600, loss[loss=0.1811, simple_loss=0.2682, pruned_loss=0.04702, over 24552.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2584, pruned_loss=0.0549, over 4480252.12 frames. ], batch size: 71, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:42:15,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 02:42:16,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:42:17,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:42:17,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:26,650 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 1.971e+02 2.265e+02 3.697e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 02:42:26,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:42:28,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:42:30,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. 
Duration: 0.83 2023-09-30 02:42:30,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=570626.6666666666, ans=0.125 2023-09-30 02:42:31,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:42:33,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:42:36,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:37,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 02:42:39,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:42:45,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 02:42:47,657 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=570760.0, ans=0.0 2023-09-30 02:42:48,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:42:48,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:48,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 02:42:49,793 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=570760.0, ans=0.125 2023-09-30 02:42:49,915 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=570760.0, ans=0.125 2023-09-30 02:42:56,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=570760.0, ans=0.0 2023-09-30 02:42:57,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:42:57,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:42:57,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:00,045 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.72 vs. limit=15.0 2023-09-30 02:43:04,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:43:09,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:09,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:43:09,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:43:10,952 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=570826.6666666666, ans=0.125 2023-09-30 02:43:18,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 02:43:24,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:43:24,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:43:29,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 02:43:31,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:43:33,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 02:43:33,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:43:34,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:43:39,676 INFO [train.py:1039] (0/4) Epoch 17, batch 650, loss[loss=0.1598, simple_loss=0.2209, pruned_loss=0.0493, over 23456.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2573, pruned_loss=0.05472, over 4535051.64 frames. ], batch size: 285, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:43:42,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. 
Duration: 0.4666875 2023-09-30 02:43:42,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:43:43,873 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.81 vs. limit=15.0 2023-09-30 02:43:44,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:43:45,017 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:43:46,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:43:49,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:43:51,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 02:43:52,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:57,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:43:57,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:01,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:06,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-09-30 02:44:07,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:07,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:11,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:11,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:44:14,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:14,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:14,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:44:16,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:17,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:44:17,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:44:19,414 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 02:44:19,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 02:44:19,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:24,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:24,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:26,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:26,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:44:27,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 02:44:29,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:44:29,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:44:30,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:44:30,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:32,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:44:34,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. 
Duration: 0.83 2023-09-30 02:44:36,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 02:44:37,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:37,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:37,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:44:39,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:41,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:41,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=571160.0, ans=0.125 2023-09-30 02:44:47,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:47,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:49,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:53,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:53,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 02:44:53,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:45:01,941 INFO [train.py:1039] (0/4) Epoch 17, batch 700, loss[loss=0.2038, simple_loss=0.2837, pruned_loss=0.06192, over 24030.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2567, pruned_loss=0.05429, over 4588301.42 frames. ], batch size: 80, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:45:02,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:45:02,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:02,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:02,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:06,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 02:45:06,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=571293.3333333334, ans=0.0 2023-09-30 02:45:08,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. 
Duration: 0.97 2023-09-30 02:45:11,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=571293.3333333334, ans=0.0 2023-09-30 02:45:12,194 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.891e+02 2.090e+02 2.521e+02 3.567e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 02:45:12,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 02:45:13,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:15,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:45:16,258 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.48 vs. limit=15.0 2023-09-30 02:45:16,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 02:45:20,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:23,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:45:26,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:28,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:45:28,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:45:30,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:33,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 02:45:33,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:45:34,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 02:45:38,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 02:45:42,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:45:42,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:45:45,124 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.06 vs. limit=15.0 2023-09-30 02:45:45,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:45:48,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:45:49,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=571426.6666666666, ans=0.0 2023-09-30 02:45:50,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-09-30 02:45:54,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=571493.3333333334, ans=0.125 2023-09-30 02:45:55,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:55,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:45:55,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 02:45:58,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:46:00,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:03,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:08,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:46:09,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 02:46:13,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 02:46:13,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 02:46:15,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 02:46:19,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:19,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:46:19,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=571560.0, ans=0.125 2023-09-30 02:46:22,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:22,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 02:46:24,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571626.6666666666, ans=0.1 2023-09-30 02:46:25,715 INFO [train.py:1039] (0/4) Epoch 17, batch 750, loss[loss=0.1678, simple_loss=0.2534, pruned_loss=0.04112, over 24654.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2563, pruned_loss=0.05427, over 4625413.32 frames. ], batch size: 73, lr: 6.14e-03, grad_scale: 4.0 2023-09-30 02:46:28,190 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.91 vs. limit=10.0 2023-09-30 02:46:28,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 02:46:28,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 02:46:28,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. 
Duration: 0.85 2023-09-30 02:46:30,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 02:46:30,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 02:46:31,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:46:32,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 02:46:33,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:35,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:46:36,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:38,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:38,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:46:38,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:42,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:46:42,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 02:46:43,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=571693.3333333334, ans=0.07 2023-09-30 02:46:45,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:46:48,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:48,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:49,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 02:46:51,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:46:51,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:53,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:57,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:46:59,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 02:46:59,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:00,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-09-30 02:47:00,819 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 02:47:00,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 02:47:00,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:47:00,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:47:03,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:47:10,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:47:11,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:11,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:47:13,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:47:14,847 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.39 vs. limit=15.0 2023-09-30 02:47:15,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=571826.6666666666, ans=0.2 2023-09-30 02:47:17,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:47:17,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 02:47:17,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:47:18,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:47:20,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:47:21,324 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.14 vs. limit=8.0 2023-09-30 02:47:23,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:47:24,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 02:47:24,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:31,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:47:33,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:47:33,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:47:36,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 02:47:39,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 02:47:40,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:40,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:46,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:48,749 INFO [train.py:1039] (0/4) Epoch 17, batch 800, loss[loss=0.1751, simple_loss=0.263, pruned_loss=0.04358, over 24508.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.257, pruned_loss=0.054, over 4663500.63 frames. ], batch size: 66, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:47:48,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:47:55,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:55,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:47:55,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571960.0, ans=0.1 2023-09-30 02:47:57,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:58,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:00,275 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.820e+02 2.048e+02 2.346e+02 3.292e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 02:48:00,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:02,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:05,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:07,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:48:10,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 02:48:12,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:48:13,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:48:15,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:48:15,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:15,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=572026.6666666666, ans=0.125 2023-09-30 02:48:16,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 02:48:16,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:18,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 02:48:21,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:24,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:26,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:48:26,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:28,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:48:28,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:32,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:48:34,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:48:34,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:48:36,499 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 02:48:36,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 02:48:36,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:48:36,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:48:39,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:39,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:48:44,882 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 02:48:46,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. 
Duration: 0.4 2023-09-30 02:48:48,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:48:51,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:48:51,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=572160.0, ans=0.125 2023-09-30 02:48:54,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:48:58,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:00,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 02:49:00,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:49:04,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 02:49:06,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=572226.6666666666, ans=0.125 2023-09-30 02:49:09,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:11,223 INFO [train.py:1039] (0/4) Epoch 17, batch 850, loss[loss=0.1983, simple_loss=0.2806, pruned_loss=0.05798, over 24552.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2577, pruned_loss=0.054, over 4678071.82 frames. 
], batch size: 71, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:49:11,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:49:12,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 02:49:12,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:49:14,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:14,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 02:49:14,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:16,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:49:18,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:18,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=572293.3333333334, ans=0.125 2023-09-30 02:49:19,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:49:21,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 02:49:23,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 02:49:23,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=572293.3333333334, ans=0.125 2023-09-30 02:49:24,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 02:49:24,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 02:49:26,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:26,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:49:29,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:29,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:29,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:49:31,637 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.56 vs. limit=22.5 2023-09-30 02:49:35,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:35,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:49:37,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 02:49:40,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 02:49:44,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:44,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 02:49:47,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 02:49:49,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 02:49:51,527 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 02:49:51,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:49:51,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:49:52,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:49:54,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:56,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 02:49:56,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 02:49:59,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:50:01,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:01,560 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=572493.3333333334, ans=0.125 2023-09-30 02:50:02,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:50:02,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:50:04,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:50:04,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=572493.3333333334, ans=0.2 2023-09-30 02:50:06,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:50:06,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. 
Duration: 0.98 2023-09-30 02:50:09,852 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=572493.3333333334, ans=0.1 2023-09-30 02:50:11,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:50:11,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:12,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:50:12,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:14,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:15,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:50:19,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:50:21,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:50:21,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:22,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:50:29,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. 
Duration: 0.62225 2023-09-30 02:50:31,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:32,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 02:50:32,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=572626.6666666666, ans=0.1 2023-09-30 02:50:33,898 INFO [train.py:1039] (0/4) Epoch 17, batch 900, loss[loss=0.2046, simple_loss=0.2788, pruned_loss=0.06523, over 24414.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2586, pruned_loss=0.0544, over 4693767.70 frames. ], batch size: 77, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:50:33,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:33,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:37,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 02:50:43,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=572626.6666666666, ans=0.125 2023-09-30 02:50:44,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:50:45,735 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.971e+02 2.244e+02 2.720e+02 3.662e+02, threshold=4.487e+02, percent-clipped=0.0 2023-09-30 02:50:47,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 02:50:49,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 02:50:50,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=572693.3333333334, ans=0.0 2023-09-30 02:50:52,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:50:52,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 02:50:52,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=572693.3333333334, ans=0.1 2023-09-30 02:50:53,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:50:55,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:55,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:50:55,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:50:55,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 02:51:00,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=572693.3333333334, ans=0.125 2023-09-30 02:51:05,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:05,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:51:07,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:51:10,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=572760.0, ans=0.125 2023-09-30 02:51:11,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:15,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 02:51:18,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:51:20,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:51:22,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:51:22,353 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 02:51:23,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. 
Duration: 0.83 2023-09-30 02:51:30,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:51:30,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:51:32,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:51:40,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:40,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:51:42,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 02:51:42,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:45,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 02:51:46,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:51:47,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:49,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:51:49,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 02:51:53,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 02:51:55,181 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 02:51:55,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:51:55,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 02:51:56,798 INFO [train.py:1039] (0/4) Epoch 17, batch 950, loss[loss=0.1924, simple_loss=0.2586, pruned_loss=0.0631, over 23556.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.259, pruned_loss=0.0544, over 4697987.95 frames. ], batch size: 134, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:51:58,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:03,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 02:52:06,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=572960.0, ans=0.125 2023-09-30 02:52:08,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:10,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 02:52:10,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:52:12,553 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 02:52:18,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:18,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:18,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:18,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:52:20,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 02:52:20,306 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=573026.6666666666, ans=0.125 2023-09-30 02:52:22,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:52:23,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:27,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 02:52:28,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 02:52:29,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=573093.3333333334, ans=0.125 2023-09-30 02:52:33,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:33,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:34,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:34,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 02:52:34,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=573093.3333333334, ans=0.1 2023-09-30 02:52:37,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:52:38,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:40,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:52:44,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:52:44,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:48,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. 
Duration: 0.47 2023-09-30 02:52:51,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 02:52:51,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:52:51,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:52:51,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:51,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:52:56,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 02:52:58,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:53:01,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:03,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:03,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 02:53:03,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:03,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 02:53:03,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 02:53:03,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=573226.6666666666, ans=6.0 2023-09-30 02:53:08,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:53:10,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:14,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:16,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 02:53:16,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 02:53:20,120 INFO [train.py:1039] (0/4) Epoch 17, batch 1000, loss[loss=0.1956, simple_loss=0.279, pruned_loss=0.05607, over 24051.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2584, pruned_loss=0.05414, over 4704386.27 frames. ], batch size: 80, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:53:20,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:23,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 02:53:23,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:53:29,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:53:30,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 02:53:30,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 02:53:30,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=573293.3333333334, ans=0.0 2023-09-30 02:53:32,100 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.898e+02 2.133e+02 2.497e+02 3.739e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-30 02:53:36,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:36,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:36,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=573360.0, ans=0.125 2023-09-30 02:53:37,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:42,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 02:53:45,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 02:53:47,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. 
Duration: 0.75 2023-09-30 02:53:47,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:53:51,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 02:53:52,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 02:53:53,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 02:53:54,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:54,416 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=573426.6666666666, ans=0.125 2023-09-30 02:53:55,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:03,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:05,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:54:05,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:06,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:06,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. 
Duration: 0.8 2023-09-30 02:54:06,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:07,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=573426.6666666666, ans=0.125 2023-09-30 02:54:08,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:54:08,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:08,598 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 02:54:09,272 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.44 vs. limit=15.0 2023-09-30 02:54:10,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=573493.3333333334, ans=0.0 2023-09-30 02:54:13,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 02:54:14,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 02:54:17,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 02:54:18,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:54:27,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:54:27,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:54:27,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:54:28,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 02:54:30,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:54:30,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 02:54:32,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 02:54:33,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:54:33,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:36,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=573560.0, ans=0.2 2023-09-30 02:54:37,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:54:39,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 02:54:39,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:43,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=573626.6666666666, ans=0.125 2023-09-30 02:54:44,345 INFO [train.py:1039] (0/4) Epoch 17, batch 1050, loss[loss=0.1807, simple_loss=0.2498, pruned_loss=0.0558, over 23617.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.257, pruned_loss=0.05381, over 4709585.68 frames. ], batch size: 149, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:54:44,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:54:44,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:54:44,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=573626.6666666666, ans=0.1 2023-09-30 02:54:48,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:54:48,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=573626.6666666666, ans=0.125 2023-09-30 02:54:49,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:51,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 02:54:54,338 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=573626.6666666666, ans=0.1 2023-09-30 02:54:55,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:54:57,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:54:59,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:55:01,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:55:01,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:55:02,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:55:02,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 02:55:04,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:05,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 02:55:08,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:55:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-30 02:55:08,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 02:55:15,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:55:17,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:55:17,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:21,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 02:55:21,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 02:55:21,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:55:24,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 02:55:28,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 02:55:28,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:55:31,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:55:33,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-09-30 02:55:33,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:55:35,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:55:39,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:55:44,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 02:55:45,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 02:55:46,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 02:55:46,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:46,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:55:48,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 02:55:51,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:55:54,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:54,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 02:55:56,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:55:56,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 02:56:01,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:56:01,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 02:56:01,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 02:56:03,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:56:06,418 INFO [train.py:1039] (0/4) Epoch 17, batch 1100, loss[loss=0.1752, simple_loss=0.2441, pruned_loss=0.0531, over 23662.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2568, pruned_loss=0.05357, over 4719447.39 frames. ], batch size: 232, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:56:06,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:13,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:56:18,389 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.918e+02 2.180e+02 2.425e+02 3.491e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:56:18,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=573960.0, ans=0.125 2023-09-30 02:56:20,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:56:20,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:56:20,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:21,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 02:56:23,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:56:25,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:56:28,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:56:32,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:56:32,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. 
Duration: 0.75 2023-09-30 02:56:35,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:56:36,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:36,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:56:37,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=574026.6666666666, ans=0.0 2023-09-30 02:56:38,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:56:42,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:56:48,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:56:50,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 02:56:51,458 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 02:56:51,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 02:56:51,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=574093.3333333334, ans=0.125 2023-09-30 02:56:53,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=574093.3333333334, ans=0.025 2023-09-30 02:56:55,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:55,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:56:55,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:56,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:56:56,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=574160.0, ans=0.07 2023-09-30 02:56:58,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 02:56:58,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:56:58,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:56:58,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:56:59,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 02:56:59,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 02:57:06,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:57:06,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 02:57:09,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:57:14,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:57:17,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=574226.6666666666, ans=0.1 2023-09-30 02:57:18,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 02:57:18,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:57:18,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:20,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:22,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:22,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. 
Duration: 0.86 2023-09-30 02:57:24,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:57:24,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:27,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 02:57:27,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:57:27,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 02:57:29,015 INFO [train.py:1039] (0/4) Epoch 17, batch 1150, loss[loss=0.1937, simple_loss=0.2648, pruned_loss=0.06128, over 23281.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.257, pruned_loss=0.05401, over 4715923.57 frames. ], batch size: 119, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:57:29,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:57:29,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:57:30,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:57:36,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:39,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 02:57:42,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:42,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:57:44,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 02:57:44,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:57:47,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 02:57:47,637 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=574360.0, ans=0.125 2023-09-30 02:57:48,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:48,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:57:52,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 02:57:55,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:59,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:58:01,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 02:58:01,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 02:58:01,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:58:01,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:58:04,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 02:58:04,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:58:05,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:58:06,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=574426.6666666666, ans=0.125 2023-09-30 02:58:06,050 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=574426.6666666666, ans=0.125 2023-09-30 02:58:17,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:22,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:24,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 02:58:24,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 02:58:24,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:30,934 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 02:58:33,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:42,089 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 02:58:46,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:58:47,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=574560.0, ans=0.125 2023-09-30 02:58:48,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:58:48,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:58:48,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:58:51,686 INFO [train.py:1039] (0/4) Epoch 17, batch 1200, loss[loss=0.1988, simple_loss=0.2825, pruned_loss=0.05751, over 24644.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2577, pruned_loss=0.05424, over 4712227.74 frames. ], batch size: 68, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 02:58:51,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 02:58:55,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=574626.6666666666, ans=10.0 2023-09-30 02:58:57,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:58:59,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:59:01,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:01,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:01,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:59:03,932 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.927e+02 2.114e+02 2.514e+02 4.321e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 02:59:04,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:59:05,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:59:07,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:07,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:11,715 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. 
Duration: 0.896 2023-09-30 02:59:13,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 02:59:14,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=574693.3333333334, ans=0.125 2023-09-30 02:59:17,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:59:20,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:59:21,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:25,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:59:25,296 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 02:59:25,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:27,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=574760.0, ans=0.125 2023-09-30 02:59:35,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:59:35,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:59:35,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. 
Duration: 0.94 2023-09-30 02:59:37,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:59:40,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 02:59:43,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 02:59:45,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:45,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:46,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:46,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:59:48,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:48,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:59:50,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:59:50,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 02:59:50,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 02:59:51,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:59:51,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:59:53,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:53,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:58,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:00:01,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:00:04,474 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.63 vs. limit=15.0 2023-09-30 03:00:06,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 03:00:10,294 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 03:00:13,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:14,776 INFO [train.py:1039] (0/4) Epoch 17, batch 1250, loss[loss=0.1611, simple_loss=0.2336, pruned_loss=0.04432, over 24418.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2585, pruned_loss=0.05498, over 4715434.73 frames. 
], batch size: 58, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:00:14,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:00:15,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:00:16,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:00:21,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 03:00:24,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:00:26,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:27,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 03:00:30,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:00:32,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:00:36,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:00:36,658 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.27 vs. 
limit=12.0 2023-09-30 03:00:37,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:37,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:00:37,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:40,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:00:44,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:00:46,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:00:46,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:47,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:48,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:00:51,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:00:52,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:00:57,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-09-30 03:00:59,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:01:02,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:02,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 03:01:02,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:01:02,487 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 03:01:03,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:03,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:07,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:09,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=575160.0, ans=0.125 2023-09-30 03:01:12,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:12,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:01:13,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-09-30 03:01:14,256 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:01:15,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 03:01:15,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 03:01:18,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:19,888 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.77 vs. limit=22.5 2023-09-30 03:01:21,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 03:01:21,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:23,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:01:23,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:01:24,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 03:01:24,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:01:24,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 03:01:24,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:01:26,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:27,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 03:01:30,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:33,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:01:34,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:01:37,368 INFO [train.py:1039] (0/4) Epoch 17, batch 1300, loss[loss=0.1815, simple_loss=0.2731, pruned_loss=0.04492, over 24638.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2585, pruned_loss=0.05518, over 4717311.75 frames. ], batch size: 68, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:01:38,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:01:42,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:42,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. 
Duration: 0.93 2023-09-30 03:01:48,564 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.803e+02 1.977e+02 2.132e+02 2.913e+02, threshold=3.954e+02, percent-clipped=0.0 2023-09-30 03:01:48,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:50,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:01:51,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:01:54,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:54,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:01:54,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 03:01:59,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:02:00,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:02:02,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 03:02:03,046 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.28 vs. 
limit=10.0 2023-09-30 03:02:05,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:02:07,828 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.76 vs. limit=15.0 2023-09-30 03:02:09,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:10,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:12,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:02:14,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=575426.6666666666, ans=0.2 2023-09-30 03:02:15,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:15,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:02:16,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:02:16,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 03:02:22,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:02:22,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 03:02:25,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 03:02:26,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:02:26,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:02:28,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.16 vs. limit=12.0 2023-09-30 03:02:29,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:02:31,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 03:02:31,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:32,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 03:02:34,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:37,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:37,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:02:42,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. 
Duration: 0.87 2023-09-30 03:02:42,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 03:02:44,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 03:02:49,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:02:49,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=575560.0, ans=0.0 2023-09-30 03:02:51,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=575560.0, ans=0.125 2023-09-30 03:02:52,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 03:02:52,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:01,428 INFO [train.py:1039] (0/4) Epoch 17, batch 1350, loss[loss=0.1965, simple_loss=0.2397, pruned_loss=0.07668, over 19546.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2573, pruned_loss=0.05509, over 4694366.82 frames. ], batch size: 388, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:03:03,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 03:03:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:09,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:03:11,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:11,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:14,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:03:14,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 03:03:22,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:03:23,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:03:26,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 03:03:27,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:03:27,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:03:27,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. 
Duration: 0.98 2023-09-30 03:03:29,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 03:03:31,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 03:03:34,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:34,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 03:03:46,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:56,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:56,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:03:58,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 03:03:59,240 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.15 vs. limit=15.0 2023-09-30 03:04:01,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:02,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 03:04:02,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-09-30 03:04:04,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:04:06,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:04:10,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 03:04:11,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:04:17,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 03:04:19,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 03:04:24,863 INFO [train.py:1039] (0/4) Epoch 17, batch 1400, loss[loss=0.1841, simple_loss=0.2496, pruned_loss=0.05932, over 23671.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2565, pruned_loss=0.05422, over 4717456.28 frames. ], batch size: 232, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:04:24,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 03:04:26,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:29,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:04:31,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:04:34,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 03:04:35,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 03:04:36,886 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.880e+02 2.143e+02 2.482e+02 5.482e+02, threshold=4.285e+02, percent-clipped=2.0 2023-09-30 03:04:48,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:04:50,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:04:52,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:04:52,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:04:55,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:04:57,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 03:05:01,107 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.95 vs. limit=12.0 2023-09-30 03:05:06,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:08,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:05:11,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 03:05:12,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:05:13,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:05:13,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:05:15,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:15,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:05:17,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:05:17,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:05:17,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 03:05:17,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:05:19,878 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.68 vs. limit=12.0 2023-09-30 03:05:22,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:05:25,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=576160.0, ans=0.0 2023-09-30 03:05:27,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:05:30,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=576226.6666666666, ans=0.0 2023-09-30 03:05:35,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=576226.6666666666, ans=0.0 2023-09-30 03:05:36,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 03:05:37,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:05:37,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:05:41,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 03:05:43,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:44,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:05:46,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 03:05:48,362 INFO [train.py:1039] (0/4) Epoch 17, batch 1450, loss[loss=0.2138, simple_loss=0.2746, pruned_loss=0.07646, over 23725.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2559, pruned_loss=0.05416, over 4704806.88 frames. ], batch size: 164, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:05:48,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:05:48,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:50,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:05:55,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:56,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:05:58,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:58,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 03:05:59,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:06:02,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 03:06:02,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:06:05,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:05,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 03:06:05,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:06,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:06:06,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 03:06:08,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:08,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:06:11,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:12,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:17,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:06:17,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:06:19,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:06:21,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:22,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:22,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:06:24,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:24,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:28,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 03:06:30,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=576426.6666666666, ans=0.0 2023-09-30 03:06:31,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:35,197 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 03:06:35,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:37,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:06:38,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 03:06:41,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 03:06:44,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:46,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 03:06:46,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 03:06:47,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:51,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:06:51,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:53,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 03:06:56,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 03:06:56,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 03:06:58,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:58,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 03:07:07,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=576560.0, ans=0.125 2023-09-30 03:07:11,606 INFO [train.py:1039] (0/4) Epoch 17, batch 1500, loss[loss=0.2182, simple_loss=0.2779, pruned_loss=0.07922, over 19591.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2573, pruned_loss=0.05484, over 4704148.31 frames. ], batch size: 388, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:07:11,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 03:07:11,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:07:11,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:07:12,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=576626.6666666666, ans=0.07 2023-09-30 03:07:13,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:13,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:14,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:07:15,532 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.51 vs. limit=22.5 2023-09-30 03:07:16,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. 
Duration: 0.65 2023-09-30 03:07:16,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:07:16,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:07:18,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:19,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:07:21,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:07:21,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:22,944 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 1.876e+02 2.186e+02 2.555e+02 3.680e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 03:07:26,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:26,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 03:07:27,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:07:27,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 03:07:29,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:35,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 03:07:39,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 03:07:42,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:42,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 03:07:45,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:07:48,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:07:49,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:49,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:07:51,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 03:07:52,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 03:07:53,070 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=576760.0, ans=0.0 2023-09-30 03:07:54,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:07:54,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 03:07:54,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:07:58,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=576760.0, ans=0.0 2023-09-30 03:08:01,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:08:01,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 03:08:07,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:08:08,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=576826.6666666666, ans=0.125 2023-09-30 03:08:08,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=576826.6666666666, ans=0.1 2023-09-30 03:08:09,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:08:14,742 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. 
Duration: 0.768 2023-09-30 03:08:14,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:14,837 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 03:08:17,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:19,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:08:21,476 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 03:08:21,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:08:23,279 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=576893.3333333334, ans=0.125 2023-09-30 03:08:23,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=576893.3333333334, ans=0.1 2023-09-30 03:08:24,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 03:08:26,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:27,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=576893.3333333334, ans=0.0 2023-09-30 03:08:29,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:08:30,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:30,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:30,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:32,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:08:34,392 INFO [train.py:1039] (0/4) Epoch 17, batch 1550, loss[loss=0.1705, simple_loss=0.2591, pruned_loss=0.04098, over 24448.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2578, pruned_loss=0.05466, over 4719745.01 frames. ], batch size: 69, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:08:34,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 03:08:35,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 03:08:36,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:08:36,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 03:08:37,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 03:08:39,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 03:08:40,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:42,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:08:42,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:08:43,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:45,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:49,574 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 03:08:49,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:49,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:08:51,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:08:53,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:08:53,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 03:08:56,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 03:08:56,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 03:08:58,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 03:08:58,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 03:08:58,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:59,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:02,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:09:05,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 03:09:05,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 03:09:07,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.07 vs. limit=12.0 2023-09-30 03:09:16,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:09:19,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 03:09:19,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:09:20,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 03:09:25,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:09:26,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:31,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:09:31,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=577160.0, ans=0.0 2023-09-30 03:09:34,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:09:34,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:34,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 03:09:34,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:36,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 03:09:36,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=577160.0, ans=0.125 2023-09-30 03:09:38,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:38,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:09:38,162 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 03:09:41,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:09:47,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 03:09:52,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:54,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:54,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 03:09:55,670 INFO [train.py:1039] (0/4) Epoch 17, batch 1600, loss[loss=0.1723, simple_loss=0.2556, pruned_loss=0.04451, over 24330.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2581, pruned_loss=0.05446, over 4734498.73 frames. ], batch size: 61, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:09:55,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 03:09:55,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:55,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:09:56,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:09:56,957 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.91 vs. limit=10.0 2023-09-30 03:09:58,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:10:00,580 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.90 vs. limit=15.0 2023-09-30 03:10:01,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:01,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=577293.3333333334, ans=0.0 2023-09-30 03:10:02,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 03:10:04,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 03:10:05,247 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.86 vs. 
limit=22.5 2023-09-30 03:10:06,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 03:10:06,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:07,919 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.101e+02 2.421e+02 4.828e+02, threshold=4.202e+02, percent-clipped=4.0 2023-09-30 03:10:08,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 03:10:09,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:10:11,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:10:16,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:10:19,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 03:10:22,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:10:22,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 03:10:24,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:24,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. 
Duration: 0.96 2023-09-30 03:10:32,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 03:10:37,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=577426.6666666666, ans=0.125 2023-09-30 03:10:37,560 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=577426.6666666666, ans=0.025 2023-09-30 03:10:40,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:40,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 03:10:41,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=577426.6666666666, ans=0.125 2023-09-30 03:10:42,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:42,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:42,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:10:45,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:10:50,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-30 03:10:53,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:10:53,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:54,501 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.53 vs. limit=5.0 2023-09-30 03:10:55,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:10:58,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:10:58,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:11:01,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:11:07,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:11:11,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 03:11:11,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 03:11:12,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 03:11:18,130 INFO [train.py:1039] (0/4) Epoch 17, batch 1650, loss[loss=0.2062, simple_loss=0.2711, pruned_loss=0.07067, over 23677.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.259, pruned_loss=0.05541, over 4711341.67 frames. ], batch size: 179, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:11:18,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:21,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:11:21,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:11:21,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 03:11:22,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=577626.6666666666, ans=0.0 2023-09-30 03:11:23,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 03:11:23,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 03:11:23,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. 
Duration: 0.59 2023-09-30 03:11:23,565 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=577626.6666666666, ans=0.125 2023-09-30 03:11:27,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:29,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:29,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:11:29,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:11:31,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:32,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 03:11:35,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:11:35,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:35,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:11:36,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 03:11:36,254 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=577693.3333333334, ans=0.0 2023-09-30 03:11:37,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 03:11:37,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 03:11:43,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=577693.3333333334, ans=0.1 2023-09-30 03:11:45,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:11:48,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:11:58,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 03:11:59,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:00,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 03:12:04,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:07,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:12:07,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 03:12:07,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:08,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:12:08,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:11,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:11,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:13,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:13,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:14,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:14,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:12:19,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:19,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 03:12:21,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 03:12:22,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 03:12:22,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 03:12:22,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 03:12:22,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:24,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:12:24,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:24,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:24,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 03:12:25,080 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=15.0 2023-09-30 03:12:29,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:30,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:12:31,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:12:32,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 03:12:38,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:38,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:12:39,906 INFO [train.py:1039] (0/4) Epoch 17, batch 1700, loss[loss=0.1951, simple_loss=0.2803, pruned_loss=0.05496, over 24435.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2582, pruned_loss=0.05487, over 4720741.66 frames. ], batch size: 69, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:12:39,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 03:12:40,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:12:40,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:12:40,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:43,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:12:44,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:12:44,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. 
Duration: 0.92 2023-09-30 03:12:48,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:12:50,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=577960.0, ans=0.2 2023-09-30 03:12:51,380 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.846e+02 2.021e+02 2.222e+02 3.253e+02, threshold=4.041e+02, percent-clipped=0.0 2023-09-30 03:12:56,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:00,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:13:05,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=578026.6666666666, ans=0.125 2023-09-30 03:13:07,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:13:07,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:08,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:13:08,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:10,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 03:13:11,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 03:13:13,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:13,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:13:14,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:13:17,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 03:13:17,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 03:13:20,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:22,335 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.85 vs. limit=15.0 2023-09-30 03:13:23,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 03:13:24,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:13:31,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:35,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:35,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 03:13:37,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:13:37,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 03:13:38,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:40,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:40,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 03:13:41,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:13:41,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:41,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:41,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:13:43,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:43,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:13:45,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:13:45,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:13:46,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:50,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=578226.6666666666, ans=0.2 2023-09-30 03:13:53,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:53,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 03:13:54,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:56,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:58,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 03:14:02,697 INFO [train.py:1039] (0/4) Epoch 17, batch 1750, loss[loss=0.1756, simple_loss=0.2421, pruned_loss=0.05456, over 23591.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.256, pruned_loss=0.05422, over 4713164.89 frames. ], batch size: 256, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:14:04,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:06,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:14:06,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:14:06,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=578293.3333333334, ans=0.0 2023-09-30 03:14:08,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 03:14:09,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:14:13,059 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.16 vs. limit=15.0 2023-09-30 03:14:13,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:14:14,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:17,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 03:14:20,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:24,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 03:14:24,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:14:27,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:14:30,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:14:30,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 03:14:31,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:14:31,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 03:14:41,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:14:43,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:14:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:48,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:48,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:50,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:14:51,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:14:54,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:14:54,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:56,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 03:14:59,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:15:01,674 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=578493.3333333334, ans=0.05 2023-09-30 03:15:02,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 03:15:04,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:04,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:05,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:15:09,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:15:09,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 03:15:11,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:15:12,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:15,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=578560.0, ans=0.0 2023-09-30 03:15:17,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:20,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:21,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:15:23,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 03:15:23,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:24,643 INFO [train.py:1039] (0/4) Epoch 17, batch 1800, loss[loss=0.1927, simple_loss=0.2621, pruned_loss=0.06163, over 23752.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2548, pruned_loss=0.05381, over 4704051.50 frames. ], batch size: 232, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:15:26,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:15:26,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:26,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 03:15:26,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:15:27,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:15:30,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:15:30,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:32,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:15:34,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:36,401 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.847e+02 2.029e+02 2.247e+02 3.215e+02, threshold=4.058e+02, percent-clipped=0.0 2023-09-30 03:15:36,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:15:39,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:15:41,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:15:44,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:15:44,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:45,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:15:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:48,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 03:15:50,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:54,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:57,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 03:16:01,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 03:16:01,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 03:16:01,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:01,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=578760.0, ans=0.2 2023-09-30 03:16:02,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:16:02,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:04,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:16:11,387 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 03:16:12,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:16:14,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:16,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 03:16:17,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 03:16:17,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:16:18,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:16:20,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:16:26,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. 
Duration: 0.67 2023-09-30 03:16:28,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=578826.6666666666, ans=0.125 2023-09-30 03:16:31,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:16:31,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 03:16:32,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:16:32,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:32,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:16:32,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 03:16:37,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:16:37,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:16:37,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=578893.3333333334, ans=0.125 2023-09-30 03:16:39,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 03:16:39,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:16:40,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=578893.3333333334, ans=0.125 2023-09-30 03:16:42,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:42,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:16:42,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:44,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:45,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:16:47,373 INFO [train.py:1039] (0/4) Epoch 17, batch 1850, loss[loss=0.1912, simple_loss=0.2641, pruned_loss=0.05921, over 23409.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2558, pruned_loss=0.05415, over 4696644.39 frames. ], batch size: 119, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:16:47,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:47,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:50,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:16:52,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 03:16:54,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=578960.0, ans=0.0 2023-09-30 03:16:54,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=578960.0, ans=0.04949747468305833 2023-09-30 03:17:01,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:17:01,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 03:17:06,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 03:17:08,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 03:17:12,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:12,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 03:17:12,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 03:17:22,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:17:24,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 03:17:27,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:17:29,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:17:32,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 03:17:34,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:34,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:17:36,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:17:38,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:17:41,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:17:45,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:17:45,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:45,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:17:45,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:48,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:17:48,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:17:54,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 03:17:54,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:57,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:17:57,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:17:57,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 03:17:57,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 03:18:00,346 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 03:18:00,475 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 03:18:02,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:18:02,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:18:03,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 03:18:03,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:03,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 03:18:03,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:18:05,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:07,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:18:09,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:18:10,948 INFO [train.py:1039] (0/4) Epoch 17, batch 1900, loss[loss=0.1963, simple_loss=0.2661, pruned_loss=0.06322, over 23403.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2569, pruned_loss=0.05473, over 4699473.60 frames. ], batch size: 134, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:18:11,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:18:11,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 03:18:12,849 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:18:14,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:14,152 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. 
Duration: 0.896 2023-09-30 03:18:14,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:18:15,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:20,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:23,280 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.816e+02 1.988e+02 2.233e+02 2.900e+02, threshold=3.976e+02, percent-clipped=0.0 2023-09-30 03:18:23,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:18:24,945 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 03:18:26,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 03:18:28,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:30,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:18:30,182 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 03:18:30,264 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 03:18:33,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-09-30 03:18:35,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:18:40,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 03:18:42,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 03:18:46,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=579426.6666666666, ans=0.1 2023-09-30 03:18:51,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 03:18:55,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 03:18:55,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:55,150 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 03:18:56,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 03:18:56,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 03:18:56,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 03:18:56,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:19:01,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 03:19:04,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:19:08,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:08,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 03:19:11,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:19:14,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 03:19:14,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:23,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:19:23,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:19:23,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:19:24,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:19:26,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 03:19:26,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:19:27,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:19:29,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:29,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:19:29,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=579560.0, ans=0.0 2023-09-30 03:19:32,262 INFO [train.py:1039] (0/4) Epoch 17, batch 1950, loss[loss=0.1854, simple_loss=0.2551, pruned_loss=0.05784, over 23571.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2581, pruned_loss=0.05526, over 4707574.26 frames. ], batch size: 149, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:19:32,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:19:32,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:32,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:34,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:37,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 03:19:40,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:19:40,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:41,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:19:42,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 03:19:42,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:19:44,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:44,601 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:19:45,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:48,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:19:48,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:19:50,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:52,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 03:19:55,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:55,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:19:55,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:19:55,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:00,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:03,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:20:03,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:03,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:20:03,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 03:20:04,403 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.44 vs. limit=15.0 2023-09-30 03:20:04,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:20:05,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:20:06,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:11,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:14,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:20:19,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:20:20,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:20:22,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:20:22,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 03:20:22,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:20:24,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=579826.6666666666, ans=0.125 2023-09-30 03:20:26,133 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=579826.6666666666, ans=0.1 2023-09-30 03:20:27,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 03:20:29,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:20:30,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:38,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:40,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:42,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=579893.3333333334, ans=0.0 2023-09-30 03:20:43,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:44,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:45,130 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=579893.3333333334, ans=0.125 2023-09-30 03:20:48,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:20:48,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:48,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 03:20:48,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 03:20:50,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:51,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 03:20:52,936 INFO [train.py:1039] (0/4) Epoch 17, batch 2000, loss[loss=0.1855, simple_loss=0.2538, pruned_loss=0.05865, over 23267.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2588, pruned_loss=0.05558, over 4705250.30 frames. ], batch size: 119, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:20:54,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:20:57,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:59,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:20:59,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:01,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:21:03,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:07,549 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.864e+02 2.151e+02 2.440e+02 3.319e+02, threshold=4.303e+02, percent-clipped=0.0 2023-09-30 03:21:07,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. 
Duration: 0.94 2023-09-30 03:21:07,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:21:12,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:21:13,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 03:21:13,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:21:14,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:21:18,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:21:19,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 03:21:21,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=580093.3333333334, ans=0.125 2023-09-30 03:21:26,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. 
Duration: 0.83 2023-09-30 03:21:26,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:21:28,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 03:21:28,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:33,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:21:33,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:21:33,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:33,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:36,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:36,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 03:21:40,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 03:21:40,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:40,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:21:45,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:47,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:21:47,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:48,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:48,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:50,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:50,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:50,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:51,096 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.97 vs. limit=15.0 2023-09-30 03:21:51,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:55,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 03:21:57,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 03:22:03,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:22:04,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:22:12,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:15,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:15,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:17,173 INFO [train.py:1039] (0/4) Epoch 17, batch 2050, loss[loss=0.1677, simple_loss=0.2286, pruned_loss=0.05338, over 23331.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.258, pruned_loss=0.05515, over 4710643.72 frames. ], batch size: 285, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:22:17,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:22:17,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 03:22:18,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:20,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:20,780 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:22:22,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:23,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:23,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=580293.3333333334, ans=0.125 2023-09-30 03:22:28,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:22:31,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:22:31,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:33,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:22:33,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 03:22:35,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:22:35,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:22:36,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:22:45,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:45,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:49,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 03:22:52,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:53,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 03:22:53,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:57,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:22:58,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:00,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:23:00,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 03:23:01,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:23:03,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:23:05,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:23:06,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:08,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:23:12,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:23:13,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:23:16,751 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=580493.3333333334, ans=0.125 2023-09-30 03:23:18,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:22,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:23:23,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 03:23:30,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 03:23:30,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:23:33,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:23:35,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 03:23:40,050 INFO [train.py:1039] (0/4) Epoch 17, batch 2100, loss[loss=0.1971, simple_loss=0.2774, pruned_loss=0.05845, over 23747.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2559, pruned_loss=0.05486, over 4692227.89 frames. ], batch size: 85, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:23:40,320 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 03:23:40,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:40,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:41,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:23:41,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:41,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 03:23:43,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. 
Duration: 0.96 2023-09-30 03:23:44,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:48,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:23:48,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:23:51,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:51,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:23:51,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 03:23:53,677 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.830e+02 2.027e+02 2.301e+02 3.593e+02, threshold=4.054e+02, percent-clipped=0.0 2023-09-30 03:23:53,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:23:53,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 03:23:53,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 03:23:56,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:23:56,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 03:23:56,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 03:23:57,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 03:24:04,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 03:24:04,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:24:07,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:07,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:24:11,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:24:11,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 03:24:13,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:13,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 03:24:13,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 03:24:14,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:24:14,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 03:24:15,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 03:24:15,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 03:24:16,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:24:19,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:24:21,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:23,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:24,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:26,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:26,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 03:24:28,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:28,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:24:28,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=580826.6666666666, ans=0.2 2023-09-30 03:24:29,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:29,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 03:24:32,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 03:24:32,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=580826.6666666666, ans=0.0 2023-09-30 03:24:33,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 03:24:36,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:24:39,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:24:39,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 03:24:47,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:48,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:24:50,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:24:50,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:24:50,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:24:51,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:24:51,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:53,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:24:53,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:24:53,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:54,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 03:24:56,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 03:24:56,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:58,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:58,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 03:25:00,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:25:00,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:25:03,749 INFO [train.py:1039] (0/4) Epoch 17, batch 2150, loss[loss=0.1869, simple_loss=0.2716, pruned_loss=0.05114, over 24439.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2559, pruned_loss=0.05453, over 4696114.80 frames. ], batch size: 69, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:25:05,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:25:07,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:08,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:08,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:25:08,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:10,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:25:13,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:15,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 03:25:15,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:25:16,028 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:25:16,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=12.0 2023-09-30 03:25:20,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:20,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 03:25:21,986 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=581026.6666666666, ans=0.125 2023-09-30 03:25:23,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:24,045 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.08 vs. limit=10.0 2023-09-30 03:25:24,084 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.85 vs. limit=6.0 2023-09-30 03:25:24,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:25:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:25:26,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:26,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:28,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:25:28,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:28,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:25:28,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.min_positive, batch_count=581026.6666666666, ans=0.025 2023-09-30 03:25:29,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:25:29,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=581026.6666666666, ans=0.125 2023-09-30 03:25:31,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 03:25:33,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:25:34,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:25:34,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:35,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:25:35,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:25:38,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:40,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:25:41,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:41,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 03:25:43,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:25:46,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:46,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:48,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:49,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 03:25:50,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:52,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:52,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 03:25:52,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=581160.0, ans=0.125 2023-09-30 03:25:53,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 03:25:53,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:25:55,172 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 03:25:55,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:55,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:25:56,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 03:25:56,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:25:56,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. 
Duration: 0.98 2023-09-30 03:25:56,857 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 03:25:56,857 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 03:25:56,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 03:25:59,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:01,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:26:01,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:26:02,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:03,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:26:06,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:06,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:16,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:26:16,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. 
Duration: 0.83 2023-09-30 03:26:19,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:26:24,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:25,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:26:27,024 INFO [train.py:1039] (0/4) Epoch 17, batch 2200, loss[loss=0.1918, simple_loss=0.259, pruned_loss=0.06225, over 23673.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2564, pruned_loss=0.05468, over 4707204.87 frames. ], batch size: 232, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:26:27,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:26:27,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:26:30,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:30,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:26:30,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 03:26:30,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=581293.3333333334, ans=0.2 2023-09-30 03:26:36,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. 
Duration: 0.69 2023-09-30 03:26:39,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:26:41,770 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.818e+02 1.974e+02 2.282e+02 3.535e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 03:26:42,211 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=581360.0, ans=0.125 2023-09-30 03:26:43,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=581360.0, ans=0.0 2023-09-30 03:26:47,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 03:26:50,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:51,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:26:51,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:26:55,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:26:56,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 03:27:00,816 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-09-30 03:27:01,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 03:27:01,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:03,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 03:27:05,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:27:06,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:08,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:27:10,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:13,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 03:27:14,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:16,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 03:27:19,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:19,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:27:19,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:27:21,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:27:23,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:23,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:23,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:26,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:27:26,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:27:28,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:27:32,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:27:32,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:27:36,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:27:37,934 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 03:27:40,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 03:27:40,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 03:27:41,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:27:41,915 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 03:27:44,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:44,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:27:46,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:47,829 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 03:27:49,215 INFO [train.py:1039] (0/4) Epoch 17, batch 2250, loss[loss=0.1625, simple_loss=0.2533, pruned_loss=0.03591, over 24299.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2571, pruned_loss=0.05484, over 4705768.72 frames. ], batch size: 74, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:27:50,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:27:54,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:27:58,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 03:28:00,401 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.78 vs. limit=6.0 2023-09-30 03:28:01,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:28:03,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:04,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:05,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:28:06,137 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=581693.3333333334, ans=0.1 2023-09-30 03:28:07,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 03:28:07,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:07,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:28:11,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 03:28:11,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:28:12,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:13,159 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.28 vs. limit=15.0 2023-09-30 03:28:15,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:22,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:23,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:28:23,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:28:25,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 03:28:25,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:29,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:28:29,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=581760.0, ans=0.125 2023-09-30 03:28:32,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:28:34,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:35,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:28:35,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:37,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:40,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:28:45,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:28:50,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:28:55,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:28:55,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:28:58,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:29:03,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:29:05,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 03:29:05,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 03:29:05,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:06,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:29:09,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 03:29:11,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:29:11,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:12,828 INFO [train.py:1039] (0/4) Epoch 17, batch 2300, loss[loss=0.1929, simple_loss=0.2588, pruned_loss=0.06348, over 23661.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2578, pruned_loss=0.05474, over 4709664.78 frames. ], batch size: 232, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:29:19,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:19,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:29:21,331 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 03:29:22,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:29:27,889 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.892e+02 2.152e+02 2.503e+02 3.822e+02, threshold=4.305e+02, percent-clipped=0.0 2023-09-30 03:29:28,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:29:29,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:29:29,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:29:29,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:29,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 03:29:31,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:29:32,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:29:34,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:29:38,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:29:42,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:29:46,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:29:49,598 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=582093.3333333334, ans=0.125 2023-09-30 03:29:51,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:29:53,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:57,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:29:59,464 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=582093.3333333334, ans=0.1 2023-09-30 03:30:00,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:04,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:30:04,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:30:06,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:30:06,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 03:30:11,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-30 03:30:11,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:12,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:12,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:30:12,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:14,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:30:14,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:30:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 03:30:14,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:30:14,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:15,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 03:30:24,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:30:25,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:30:29,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:29,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:30:30,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:30:33,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:30:33,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:30:33,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:30:35,061 INFO [train.py:1039] (0/4) Epoch 17, batch 2350, loss[loss=0.2542, simple_loss=0.3133, pruned_loss=0.09761, over 19997.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2582, pruned_loss=0.05534, over 4717074.26 frames. ], batch size: 388, lr: 6.08e-03, grad_scale: 16.0 2023-09-30 03:30:35,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 03:30:39,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=582293.3333333334, ans=0.1 2023-09-30 03:30:40,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:30:41,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. 
Duration: 0.9 2023-09-30 03:30:47,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 03:30:50,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:53,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=582360.0, ans=0.125 2023-09-30 03:30:53,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=582360.0, ans=0.1 2023-09-30 03:30:54,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:30:54,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:56,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 03:30:59,011 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.61 vs. limit=15.0 2023-09-30 03:30:59,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 03:31:08,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 03:31:09,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:31:12,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:31:12,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:31:14,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:31:16,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 03:31:18,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:31:21,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:31:21,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:21,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:31:24,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:31:26,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. 
Duration: 0.74 2023-09-30 03:31:26,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:31:26,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.14 vs. limit=15.0 2023-09-30 03:31:29,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:31:29,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:31:31,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 03:31:31,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:31:36,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 03:31:36,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:31:39,197 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.49 vs. limit=12.0 2023-09-30 03:31:42,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 03:31:47,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 03:31:47,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:31:47,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 03:31:49,187 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 03:31:49,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 03:31:51,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 03:31:54,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:31:54,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=582560.0, ans=0.035 2023-09-30 03:31:56,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=582626.6666666666, ans=0.1 2023-09-30 03:31:57,588 INFO [train.py:1039] (0/4) Epoch 17, batch 2400, loss[loss=0.1915, simple_loss=0.2734, pruned_loss=0.05478, over 23978.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2578, pruned_loss=0.05504, over 4722393.10 frames. ], batch size: 80, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:31:57,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:32:02,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:32:04,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 03:32:07,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 03:32:07,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 03:32:13,858 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.898e+02 2.064e+02 2.332e+02 3.496e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 03:32:14,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:32:14,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:32:17,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 03:32:18,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:32:18,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:18,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 03:32:25,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=582693.3333333334, ans=0.125 2023-09-30 03:32:27,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:28,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-09-30 03:32:30,458 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=582760.0, ans=0.125 2023-09-30 03:32:35,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:32:37,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=582760.0, ans=0.2 2023-09-30 03:32:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 03:32:42,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:32:43,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:47,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=582826.6666666666, ans=0.125 2023-09-30 03:32:48,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:32:48,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 03:32:50,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:32:56,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:32:59,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:32:59,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=582826.6666666666, ans=0.05 2023-09-30 03:33:03,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:03,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:33:03,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:33:03,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:33:04,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:04,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:05,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:33:11,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:11,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:33:11,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 03:33:13,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. 
Duration: 0.96 2023-09-30 03:33:16,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:33:16,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:16,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 03:33:18,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 03:33:18,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 03:33:18,488 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 03:33:19,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 03:33:20,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:33:21,460 INFO [train.py:1039] (0/4) Epoch 17, batch 2450, loss[loss=0.1713, simple_loss=0.2509, pruned_loss=0.04583, over 24330.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2563, pruned_loss=0.0547, over 4716397.22 frames. ], batch size: 61, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:33:21,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:23,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:23,776 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-30 03:33:25,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:25,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:33:28,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:33:28,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:33,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:33,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:35,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 03:33:41,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:41,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:42,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:33:44,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:33:44,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 03:33:44,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 03:33:50,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:51,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:33:53,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:56,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:33:58,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:33:58,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:00,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:34:01,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 03:34:03,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:34:11,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:12,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:34:13,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:13,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:34:13,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:14,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:34:14,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 03:34:18,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:18,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:34:23,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:34:23,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:28,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:34:28,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 03:34:28,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 03:34:29,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:34:29,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 03:34:31,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:34:33,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:34:33,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=583226.6666666666, ans=0.0 2023-09-30 03:34:36,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:34:38,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:38,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=583226.6666666666, ans=0.125 2023-09-30 03:34:39,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:34:40,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=583226.6666666666, ans=0.0 2023-09-30 03:34:40,440 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.34 vs. 
limit=10.0 2023-09-30 03:34:44,640 INFO [train.py:1039] (0/4) Epoch 17, batch 2500, loss[loss=0.2004, simple_loss=0.2836, pruned_loss=0.0586, over 24355.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2555, pruned_loss=0.05447, over 4707908.78 frames. ], batch size: 77, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:34:44,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 03:34:44,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:34:46,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=583293.3333333334, ans=0.125 2023-09-30 03:34:50,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:35:00,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:35:00,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:35:01,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:35:01,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 03:35:03,352 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.839e+02 2.064e+02 2.322e+02 3.484e+02, threshold=4.127e+02, percent-clipped=0.0 2023-09-30 03:35:09,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-30 03:35:11,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:11,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:35:11,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:35:11,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 03:35:13,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:14,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:14,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 03:35:14,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:16,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 03:35:16,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:20,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=583426.6666666666, ans=0.0 2023-09-30 03:35:21,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:35:23,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:25,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:35:26,189 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.60 vs. limit=6.0 2023-09-30 03:35:27,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 03:35:27,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:35:29,932 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-09-30 03:35:30,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:32,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=583426.6666666666, ans=0.125 2023-09-30 03:35:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:38,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=583493.3333333334, ans=0.0 2023-09-30 03:35:40,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 03:35:43,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:35:47,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:35:48,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=583493.3333333334, ans=0.125 2023-09-30 03:35:52,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 03:35:52,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:52,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:35:54,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:35:54,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:35:55,945 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 03:35:55,945 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 03:35:55,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 03:35:57,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:36:00,119 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=583560.0, ans=0.1 2023-09-30 03:36:01,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 03:36:01,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 03:36:02,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:36:03,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 03:36:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 03:36:06,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=583626.6666666666, ans=10.0 2023-09-30 03:36:08,049 INFO [train.py:1039] (0/4) Epoch 17, batch 2550, loss[loss=0.2063, simple_loss=0.2759, pruned_loss=0.06839, over 23674.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2554, pruned_loss=0.05425, over 4713924.39 frames. ], batch size: 85, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:36:08,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:11,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:36:12,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 03:36:14,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:15,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 03:36:16,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:36:21,049 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 03:36:22,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:36:24,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:27,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:36:27,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 03:36:29,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:36:29,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:29,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:36:29,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=583693.3333333334, ans=0.125 2023-09-30 03:36:33,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:36:33,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 03:36:33,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:36:33,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:33,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 03:36:47,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:36:49,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=583760.0, ans=0.0 2023-09-30 03:36:52,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:36:53,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:53,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:54,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 03:37:00,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:37:03,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:37:03,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:37:03,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:37:03,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:37:05,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:37:10,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:10,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:15,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:37:15,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 03:37:15,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:37:16,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 03:37:18,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:37:18,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:37:19,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:24,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:37:28,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:31,096 INFO [train.py:1039] (0/4) Epoch 17, batch 2600, loss[loss=0.1897, simple_loss=0.2556, pruned_loss=0.06192, over 22765.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2567, pruned_loss=0.05448, over 4716947.49 frames. ], batch size: 322, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:37:31,296 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 03:37:33,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=583960.0, ans=0.125 2023-09-30 03:37:33,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=583960.0, ans=0.0 2023-09-30 03:37:36,331 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 03:37:36,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 03:37:36,423 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 03:37:36,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 03:37:36,582 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 03:37:37,070 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.81 vs. limit=10.0 2023-09-30 03:37:41,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:41,367 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 03:37:43,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 03:37:44,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 03:37:46,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:37:48,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. 
Duration: 0.82 2023-09-30 03:37:49,474 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.083e+02 2.505e+02 2.924e+02 4.278e+02, threshold=5.011e+02, percent-clipped=1.0 2023-09-30 03:37:49,912 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=584026.6666666666, ans=0.125 2023-09-30 03:37:51,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 03:37:52,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:37:52,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 03:37:55,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 03:37:55,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 03:38:02,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:02,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:04,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:04,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 03:38:07,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 03:38:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 03:38:18,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:18,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:18,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 03:38:18,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:18,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:20,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 03:38:21,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:38:23,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:38:25,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:25,572 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.29 vs. 
limit=12.0 2023-09-30 03:38:28,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=584160.0, ans=0.125 2023-09-30 03:38:29,537 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 03:38:29,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:29,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:38:33,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=584160.0, ans=0.125 2023-09-30 03:38:34,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:34,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=584160.0, ans=0.125 2023-09-30 03:38:36,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:38:36,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 03:38:37,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:39,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:38:39,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 03:38:46,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 03:38:46,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:50,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:38:53,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 03:38:53,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:55,188 INFO [train.py:1039] (0/4) Epoch 17, batch 2650, loss[loss=0.1886, simple_loss=0.2584, pruned_loss=0.05943, over 23258.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2572, pruned_loss=0.05472, over 4714941.03 frames. ], batch size: 105, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:38:55,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:38:55,379 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 03:38:55,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:38:58,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:39:01,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 03:39:02,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:39:03,184 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=584293.3333333334, ans=0.0 2023-09-30 03:39:05,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:39:07,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 03:39:07,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:39:08,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:39:11,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 03:39:12,939 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 03:39:13,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=584360.0, ans=0.125 2023-09-30 03:39:13,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=584360.0, ans=0.125 2023-09-30 03:39:15,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:20,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. 
Duration: 0.87 2023-09-30 03:39:20,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:20,224 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=584360.0, ans=0.125 2023-09-30 03:39:20,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=584360.0, ans=0.0 2023-09-30 03:39:21,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 03:39:27,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:39:27,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:30,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 03:39:30,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 03:39:33,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:39:38,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-09-30 03:39:38,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:41,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:41,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:39:41,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:42,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:42,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:45,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:47,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:39:49,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:39:51,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:39:52,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:53,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 03:39:53,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:55,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:56,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:39:57,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=584493.3333333334, ans=0.1 2023-09-30 03:39:58,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:58,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:39:58,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:00,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 03:40:02,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:05,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:40:06,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:08,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:40:09,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:12,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:12,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 03:40:15,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:40:17,474 INFO [train.py:1039] (0/4) Epoch 17, batch 2700, loss[loss=0.1766, simple_loss=0.2549, pruned_loss=0.04912, over 24671.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2584, pruned_loss=0.05492, over 4711372.08 frames. ], batch size: 65, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:40:17,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 03:40:18,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=584626.6666666666, ans=0.2 2023-09-30 03:40:19,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:40:20,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:40:20,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:22,386 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.31 vs. limit=15.0 2023-09-30 03:40:22,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:40:22,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:22,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:40:22,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:40:22,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 03:40:24,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:40:26,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:40:27,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:40:27,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:40:31,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:40:32,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 03:40:34,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:40:36,034 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.817e+02 2.004e+02 2.215e+02 2.992e+02, threshold=4.008e+02, percent-clipped=0.0 2023-09-30 03:40:40,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:40:40,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:40:45,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:40:45,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:47,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:40:47,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:40:50,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:40:50,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=584760.0, ans=0.125 2023-09-30 03:40:50,874 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.88 vs. limit=10.0 2023-09-30 03:40:53,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:53,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:40:53,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:40:57,074 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=584760.0, ans=0.125 2023-09-30 03:40:58,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:58,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:40:59,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=584760.0, ans=0.125 2023-09-30 03:41:07,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:41:09,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:41:12,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:41:12,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:14,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=584826.6666666666, ans=0.2 2023-09-30 03:41:18,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:18,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:18,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:41:18,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=584826.6666666666, ans=0.125 2023-09-30 03:41:19,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:21,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:22,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:41:24,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 03:41:27,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:30,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 03:41:32,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:34,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:41:34,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 03:41:36,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 03:41:37,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:38,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=584960.0, ans=0.1 2023-09-30 03:41:39,335 INFO [train.py:1039] (0/4) Epoch 17, batch 2750, loss[loss=0.1685, simple_loss=0.2253, pruned_loss=0.05584, over 23387.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2576, pruned_loss=0.05479, over 4713173.22 frames. ], batch size: 285, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:41:40,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:41:41,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:45,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:45,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:41:47,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:49,174 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=584960.0, ans=0.0 2023-09-30 03:41:50,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:41:50,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:41:52,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:41:52,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:52,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 03:41:52,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:41:52,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:41:57,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 03:42:00,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:42:00,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:00,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:00,522 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=585026.6666666666, ans=0.0 2023-09-30 03:42:01,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:42:01,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:42:03,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:42:03,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:05,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:10,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:42:12,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 03:42:12,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:42:13,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:15,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:42:21,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:25,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:42:25,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:28,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:28,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:42:29,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:42:35,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:42:35,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:35,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. 
Duration: 0.75 2023-09-30 03:42:40,226 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.10 vs. limit=15.0 2023-09-30 03:42:42,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:44,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 03:42:50,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:42:53,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:42:53,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 03:42:54,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:42:56,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:42:56,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 03:42:57,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:42:58,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:43:00,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:43:00,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:00,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 03:43:00,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:01,825 INFO [train.py:1039] (0/4) Epoch 17, batch 2800, loss[loss=0.1823, simple_loss=0.2691, pruned_loss=0.0478, over 24560.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2554, pruned_loss=0.05447, over 4716967.97 frames. ], batch size: 71, lr: 6.07e-03, grad_scale: 16.0 2023-09-30 03:43:01,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:03,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:04,935 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 03:43:04,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 03:43:08,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:10,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:43:11,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:43:15,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:43:18,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 03:43:20,141 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.888e+02 2.133e+02 2.598e+02 4.037e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 03:43:20,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:43:21,319 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.50 vs. limit=15.0 2023-09-30 03:43:22,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 03:43:23,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:24,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:43:24,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:28,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:29,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:43:29,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:43:29,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=585360.0, ans=0.05 2023-09-30 03:43:31,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:43:39,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:43:39,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=585426.6666666666, ans=0.2 2023-09-30 03:43:42,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:45,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:45,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:43:46,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:52,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:43:52,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. 
Duration: 0.69 2023-09-30 03:43:52,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:54,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:54,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:43:54,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=585493.3333333334, ans=0.1 2023-09-30 03:43:59,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:59,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:04,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:44:04,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=585493.3333333334, ans=0.125 2023-09-30 03:44:06,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:44:07,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:07,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-30 03:44:07,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:44:09,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:44:10,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:44:10,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 03:44:10,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:11,306 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.42 vs. limit=15.0 2023-09-30 03:44:12,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:44:12,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:15,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 03:44:15,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:16,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 03:44:16,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:44:18,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 03:44:23,657 INFO [train.py:1039] (0/4) Epoch 17, batch 2850, loss[loss=0.1874, simple_loss=0.2582, pruned_loss=0.05829, over 23303.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2546, pruned_loss=0.05428, over 4700880.02 frames. ], batch size: 105, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:44:23,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:44:23,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:44:25,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:44:26,776 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.16 vs. limit=15.0 2023-09-30 03:44:28,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:32,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:44:32,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:44:32,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 03:44:35,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:36,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:37,191 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=585626.6666666666, ans=0.2 2023-09-30 03:44:38,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:44:40,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 03:44:45,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 03:44:45,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:48,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 03:44:49,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:51,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 03:44:52,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 03:44:54,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:45:06,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:06,925 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=585760.0, ans=0.1 2023-09-30 03:45:08,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:08,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=585760.0, ans=0.1 2023-09-30 03:45:10,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:45:11,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:45:11,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:45:11,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:45:14,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:45:14,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 03:45:16,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:45:18,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:45:18,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:18,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:21,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:21,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:23,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:23,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:26,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:45:26,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:26,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=585826.6666666666, ans=0.125 2023-09-30 03:45:27,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:31,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 03:45:35,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:45:36,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 03:45:38,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 03:45:39,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:45:41,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 03:45:41,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:45:41,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:43,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:45:43,489 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 03:45:43,568 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. 
Duration: 0.896 2023-09-30 03:45:43,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:45:43,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:46,536 INFO [train.py:1039] (0/4) Epoch 17, batch 2900, loss[loss=0.1508, simple_loss=0.218, pruned_loss=0.04176, over 24433.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2538, pruned_loss=0.05393, over 4700370.01 frames. ], batch size: 58, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:45:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:45:50,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:50,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:53,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 03:45:56,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:56,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 03:45:57,437 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.64 vs. limit=15.0 2023-09-30 03:45:58,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. 
Duration: 0.98 2023-09-30 03:45:59,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:45:59,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:46:00,151 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:46:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:03,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:46:06,435 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.832e+02 2.093e+02 2.427e+02 4.261e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 03:46:06,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:46:08,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:46:11,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:46:11,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 03:46:13,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:46:13,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:46:16,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 03:46:18,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 03:46:21,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:46:21,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 03:46:21,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:46:24,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:46:24,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:46:26,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:27,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:32,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:46:35,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:46:38,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. 
Duration: 0.71 2023-09-30 03:46:38,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 03:46:38,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:46:42,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:46:44,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 03:46:45,236 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=15.0 2023-09-30 03:46:47,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:46:50,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=586226.6666666666, ans=0.2 2023-09-30 03:46:51,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:47:02,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:47:02,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:47:02,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-09-30 03:47:02,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=586226.6666666666, ans=0.125 2023-09-30 03:47:05,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:05,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 03:47:07,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:07,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:47:08,598 INFO [train.py:1039] (0/4) Epoch 17, batch 2950, loss[loss=0.196, simple_loss=0.2773, pruned_loss=0.05733, over 23759.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2547, pruned_loss=0.05378, over 4719995.78 frames. ], batch size: 85, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:47:09,491 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. limit=6.0 2023-09-30 03:47:15,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:15,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 03:47:17,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:17,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 03:47:19,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:47:20,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:47:20,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 03:47:22,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 03:47:22,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:47:22,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:27,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=586360.0, ans=0.0 2023-09-30 03:47:30,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:33,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:47:35,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:47:36,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:47:37,184 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=586360.0, ans=0.2 2023-09-30 03:47:38,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:47:38,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:47:40,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=586426.6666666666, ans=0.125 2023-09-30 03:47:41,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:47:44,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 03:47:50,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 03:47:50,063 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 03:47:51,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:47:53,613 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-09-30 03:47:55,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 03:47:55,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:55,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:55,258 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 03:47:55,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:47:58,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 03:47:59,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:48:00,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:48:02,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:03,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:48:05,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:05,784 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. 
Duration: 0.896 2023-09-30 03:48:05,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:05,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 03:48:06,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=586493.3333333334, ans=0.0 2023-09-30 03:48:12,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=586493.3333333334, ans=0.0 2023-09-30 03:48:13,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:15,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:48:16,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 03:48:16,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:48:18,259 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff2.min_abs, batch_count=586560.0, ans=0.1 2023-09-30 03:48:19,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 03:48:21,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:23,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:48:24,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:48:26,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:26,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:48:27,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:48:27,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:27,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:48:30,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:48:31,979 INFO [train.py:1039] (0/4) Epoch 17, batch 3000, loss[loss=0.1978, simple_loss=0.2638, pruned_loss=0.06589, over 23719.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2559, pruned_loss=0.05375, over 4719258.90 frames. ], batch size: 212, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:48:31,980 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 03:48:47,114 INFO [train.py:1071] (0/4) Epoch 17, validation: loss=0.2916, simple_loss=0.2691, pruned_loss=0.1571, over 1125622.00 frames. 2023-09-30 03:48:47,115 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 03:48:47,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:48:48,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:48:49,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=586626.6666666666, ans=0.2 2023-09-30 03:48:50,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:50,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 03:48:51,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:53,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:48:55,062 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-88000.pt 2023-09-30 03:48:58,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:49:01,627 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 03:49:01,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 03:49:05,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:49:05,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 03:49:06,687 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.67 vs. limit=6.0 2023-09-30 03:49:07,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 03:49:07,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:10,633 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.835e+02 1.995e+02 2.224e+02 3.286e+02, threshold=3.989e+02, percent-clipped=0.0 2023-09-30 03:49:15,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:49:25,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:49:31,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 03:49:33,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:49:34,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:49:36,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:38,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 03:49:38,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:38,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 03:49:42,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 03:49:42,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:49:43,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:49:47,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:49:47,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:47,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:49:47,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:49:50,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:49:50,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:50,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 03:49:53,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:57,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 03:49:57,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:49:58,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:49:58,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:50:02,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:02,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:03,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:50:05,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 03:50:05,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:06,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 03:50:06,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 03:50:08,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 03:50:13,264 INFO [train.py:1039] (0/4) Epoch 17, batch 3050, loss[loss=0.1674, simple_loss=0.2427, pruned_loss=0.04603, over 24634.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2563, pruned_loss=0.05386, over 4724490.27 frames. ], batch size: 60, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:50:13,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:13,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:50:13,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 03:50:15,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 03:50:15,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:50:16,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:50:16,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:18,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:50:18,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:50:18,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:50:21,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 03:50:23,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:50:26,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:26,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:50:31,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:34,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 03:50:36,927 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.38 vs. limit=15.0 2023-09-30 03:50:39,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 03:50:39,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 03:50:40,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:50:43,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 03:50:49,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:49,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:51,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:52,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:50:54,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:54,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:54,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:54,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:55,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:58,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:00,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:00,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. 
Duration: 0.96 2023-09-30 03:51:00,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:51:02,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:51:05,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:51:06,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:51:07,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:07,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:10,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=587160.0, ans=0.0 2023-09-30 03:51:12,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:51:12,675 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.89 vs. limit=15.0 2023-09-30 03:51:13,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 03:51:13,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=587160.0, ans=0.125 2023-09-30 03:51:20,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:22,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:51:22,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:22,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:23,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:51:23,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:51:25,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 03:51:27,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:27,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:27,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 03:51:30,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 03:51:33,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:35,806 INFO [train.py:1039] (0/4) Epoch 17, batch 3100, loss[loss=0.1874, simple_loss=0.2751, pruned_loss=0.04988, over 24093.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2566, pruned_loss=0.05414, over 4725371.17 frames. ], batch size: 80, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:51:35,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:51:37,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:51:40,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 03:51:43,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 03:51:45,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 03:51:45,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:51:48,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:48,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:53,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. 
Duration: 0.52225 2023-09-30 03:51:55,413 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.922e+02 2.326e+02 2.796e+02 3.777e+02, threshold=4.651e+02, percent-clipped=0.0 2023-09-30 03:51:55,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:02,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 03:52:05,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:52:07,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:08,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:08,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:52:10,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:52:12,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:52:12,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 03:52:12,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:52:15,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:52:15,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 03:52:18,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:52:20,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:52:21,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 03:52:23,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 03:52:25,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:25,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:28,054 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.20 vs. limit=15.0 2023-09-30 03:52:28,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:28,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:30,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 03:52:32,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:52:32,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:52:33,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:52:33,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:52:33,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:33,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 03:52:37,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:38,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 03:52:41,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:52:43,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 03:52:43,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:45,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 03:52:45,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 03:52:56,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 03:52:58,043 INFO [train.py:1039] (0/4) Epoch 17, batch 3150, loss[loss=0.1639, simple_loss=0.2108, pruned_loss=0.05852, over 18940.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2548, pruned_loss=0.05391, over 4715977.62 frames. ], batch size: 388, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:52:58,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:00,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:03,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:53:03,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:53:04,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 03:53:04,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:06,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:53:07,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. 
Duration: 0.93 2023-09-30 03:53:09,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=587626.6666666666, ans=0.125 2023-09-30 03:53:10,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:10,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=587626.6666666666, ans=0.125 2023-09-30 03:53:11,747 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 03:53:14,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 03:53:14,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:53:16,380 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 03:53:16,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:53:18,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 03:53:18,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 03:53:18,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 03:53:18,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:53:18,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:20,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:22,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 03:53:24,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:28,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:53:31,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 03:53:31,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:53:36,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:53:36,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:38,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-09-30 03:53:42,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 03:53:42,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:53:42,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:53:43,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:53:43,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:43,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:53:45,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:53:45,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:53:45,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 03:53:45,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=587760.0, ans=0.0 2023-09-30 03:53:46,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:53:47,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:53:50,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:53:50,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:50,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 03:53:52,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:53:55,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 03:53:55,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:55,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 03:53:56,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 03:53:58,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:53:59,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:53:59,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 03:54:00,621 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.18 vs. 
limit=10.0 2023-09-30 03:54:01,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:54:02,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:54:05,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=587893.3333333334, ans=0.035 2023-09-30 03:54:07,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:54:08,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:08,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:54:14,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:54:16,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:17,135 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.04 vs. limit=15.0 2023-09-30 03:54:18,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:54:21,675 INFO [train.py:1039] (0/4) Epoch 17, batch 3200, loss[loss=0.1828, simple_loss=0.2546, pruned_loss=0.05544, over 23201.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2542, pruned_loss=0.05362, over 4716459.90 frames. 
], batch size: 105, lr: 6.06e-03, grad_scale: 16.0 2023-09-30 03:54:22,730 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.77 vs. limit=15.0 2023-09-30 03:54:23,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:54:23,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:54:23,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=587960.0, ans=0.09899494936611666 2023-09-30 03:54:26,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:28,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:54:28,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 03:54:28,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=587960.0, ans=0.125 2023-09-30 03:54:31,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:54:37,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 03:54:40,745 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.992e+02 2.258e+02 2.770e+02 4.284e+02, threshold=4.516e+02, percent-clipped=0.0 2023-09-30 03:54:40,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:49,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:54:54,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=588093.3333333334, ans=0.2 2023-09-30 03:55:00,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 03:55:00,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:55:02,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=588093.3333333334, ans=0.125 2023-09-30 03:55:04,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 03:55:05,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:55:08,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:55:08,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 03:55:10,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:55:15,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 03:55:17,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:55:20,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 03:55:23,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 03:55:25,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:55:30,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:55:31,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,887 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 03:55:31,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:55:35,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:55:36,188 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.61 vs. limit=15.0 2023-09-30 03:55:37,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 03:55:37,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 03:55:37,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=588226.6666666666, ans=0.125 2023-09-30 03:55:38,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 03:55:40,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 03:55:41,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:55:43,575 INFO [train.py:1039] (0/4) Epoch 17, batch 3250, loss[loss=0.2007, simple_loss=0.2643, pruned_loss=0.0685, over 23706.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2546, pruned_loss=0.05404, over 4698141.90 frames. ], batch size: 164, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:55:43,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:55:43,864 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 03:55:43,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:55:45,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:55:46,751 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 03:55:50,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:55:53,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:56:01,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:01,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 03:56:03,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:04,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:04,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:04,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:04,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:56:08,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 03:56:08,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:56:08,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:10,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:10,327 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=588360.0, ans=0.1 2023-09-30 03:56:13,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:14,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:17,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:17,659 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:20,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:20,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 03:56:20,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 03:56:26,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:56:26,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:56:26,850 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=588426.6666666666, ans=0.0 2023-09-30 03:56:27,243 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.16 vs. limit=15.0 2023-09-30 03:56:28,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:28,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:56:36,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:56:42,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:56:42,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 03:56:42,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 03:56:42,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:56:42,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:56:42,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:47,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 03:56:47,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 03:56:49,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:50,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:52,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:52,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:56:52,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:57,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:56:57,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:59,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 03:56:59,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:02,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:57:02,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 03:57:05,655 INFO [train.py:1039] (0/4) Epoch 17, batch 3300, loss[loss=0.1916, simple_loss=0.2627, pruned_loss=0.06024, over 23724.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2559, pruned_loss=0.05408, over 4699500.79 frames. ], batch size: 135, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:57:05,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:57:05,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 03:57:09,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 03:57:11,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 03:57:11,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:57:11,259 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:57:15,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:57:17,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:57:17,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:17,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:57:18,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:57:22,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:23,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:57:25,151 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.863e+02 2.079e+02 2.237e+02 3.389e+02, threshold=4.158e+02, percent-clipped=0.0 2023-09-30 03:57:26,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 03:57:28,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:28,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 03:57:29,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:30,535 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 03:57:30,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:57:32,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:57:33,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:57:33,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:57:33,647 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 03:57:38,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:38,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:57:40,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:40,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 03:57:42,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. 
Duration: 0.336375 2023-09-30 03:57:42,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:44,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:57:45,732 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 03:57:45,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 03:57:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:57:50,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 03:57:52,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:57:54,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:57:54,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:57:57,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:58,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:58,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 03:57:58,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:58:01,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:58:02,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:02,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:58:05,427 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 03:58:05,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=588826.6666666666, ans=0.125 2023-09-30 03:58:06,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 03:58:08,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:58:09,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:09,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:12,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:58:12,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 03:58:13,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:58:15,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:15,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:58:15,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:18,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:58:21,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 03:58:21,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:23,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:25,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:58:25,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:58:26,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:28,049 INFO [train.py:1039] (0/4) Epoch 17, batch 3350, loss[loss=0.1853, simple_loss=0.2648, pruned_loss=0.05294, over 23365.00 frames. 
], tot_loss[loss=0.1827, simple_loss=0.2569, pruned_loss=0.05422, over 4708428.87 frames. ], batch size: 93, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:58:30,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:30,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:33,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=588960.0, ans=0.125 2023-09-30 03:58:34,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:58:34,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:36,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:58:39,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:41,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:58:43,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:43,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:58:44,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. 
Duration: 0.91 2023-09-30 03:58:46,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 03:58:46,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:51,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 03:58:51,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 03:58:53,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:58:53,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:58:54,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:54,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 03:58:54,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:54,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:58:56,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:59,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 03:58:59,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:01,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:59:03,392 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=589093.3333333334, ans=0.0 2023-09-30 03:59:04,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:06,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:07,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:12,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:59:14,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:14,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:14,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:17,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 03:59:20,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 03:59:20,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:59:20,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 03:59:20,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:59:22,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=589160.0, ans=0.125 2023-09-30 03:59:24,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 03:59:25,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:27,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:35,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:35,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 03:59:36,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:59:37,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 03:59:39,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:59:42,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:59:46,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 03:59:46,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:59:46,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:59:49,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:49,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 03:59:49,895 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.64 vs. limit=10.0 2023-09-30 03:59:50,548 INFO [train.py:1039] (0/4) Epoch 17, batch 3400, loss[loss=0.1852, simple_loss=0.27, pruned_loss=0.05015, over 24308.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2573, pruned_loss=0.05429, over 4712191.76 frames. ], batch size: 74, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:59:50,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:50,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-09-30 03:59:52,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:52,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:53,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:59:55,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:59:55,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 03:59:59,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 03:59:59,194 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 03:59:59,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:04,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:00:04,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:00:05,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:07,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 04:00:07,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=589360.0, ans=0.0 2023-09-30 04:00:10,137 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.964e+02 2.182e+02 2.544e+02 4.408e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 04:00:13,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:17,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 04:00:22,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:00:25,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:25,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:26,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:00:32,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:00:37,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 04:00:38,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=589493.3333333334, ans=0.05 2023-09-30 04:00:43,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:00:44,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:44,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 04:00:46,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:00:46,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:46,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:47,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:00:51,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:55,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:00:55,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:01:00,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:01,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=589560.0, ans=0.07 2023-09-30 04:01:02,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. 
Duration: 0.95 2023-09-30 04:01:07,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:01:12,418 INFO [train.py:1039] (0/4) Epoch 17, batch 3450, loss[loss=0.1628, simple_loss=0.2399, pruned_loss=0.04281, over 24290.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.257, pruned_loss=0.05405, over 4715088.37 frames. ], batch size: 56, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:01:14,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 04:01:19,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 04:01:19,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:01:20,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:01:20,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 04:01:22,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:27,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:01:32,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:01:33,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:01:33,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:01:33,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:35,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:41,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 04:01:45,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=589760.0, ans=0.1 2023-09-30 04:01:48,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 04:01:48,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:01:48,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:01:50,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:56,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 04:01:57,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:02:00,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:02:00,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:02:01,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:02:04,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:02:06,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 04:02:06,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:06,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=589826.6666666666, ans=0.125 2023-09-30 04:02:07,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:02:09,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:02:11,407 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=589826.6666666666, ans=0.1 2023-09-30 04:02:13,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 04:02:17,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 04:02:22,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:02:23,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:23,664 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.95 vs. limit=15.0 2023-09-30 04:02:26,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:31,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:31,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:33,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:02:33,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:34,848 INFO [train.py:1039] (0/4) Epoch 17, batch 3500, loss[loss=0.1899, simple_loss=0.271, pruned_loss=0.05441, over 24628.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.256, pruned_loss=0.05454, over 4697252.41 frames. ], batch size: 68, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:02:38,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:41,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 04:02:42,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 04:02:43,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=589960.0, ans=0.0 2023-09-30 04:02:45,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:02:47,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:02:52,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:52,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 04:02:53,689 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.944e+02 2.191e+02 2.533e+02 4.328e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-30 04:02:56,048 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=590026.6666666666, ans=0.125 2023-09-30 04:02:57,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:02:59,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:03:00,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:03:00,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 04:03:00,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:03:00,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:00,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:00,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=590026.6666666666, ans=0.2 2023-09-30 04:03:02,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 04:03:05,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:05,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:03:07,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=590093.3333333334, ans=0.1 2023-09-30 04:03:07,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=590093.3333333334, ans=0.0 2023-09-30 04:03:08,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:10,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:03:11,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 04:03:12,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:13,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=590093.3333333334, ans=0.125 2023-09-30 04:03:15,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:17,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:03:17,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:19,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:03:19,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:21,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 04:03:21,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 04:03:21,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 04:03:22,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 04:03:24,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:26,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:26,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:03:30,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:03:30,811 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.76 vs. limit=15.0 2023-09-30 04:03:31,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:03:36,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:03:38,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 04:03:38,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 04:03:38,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:03:42,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:42,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 04:03:44,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:47,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 04:03:48,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:48,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:50,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 04:03:53,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 04:03:54,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:56,841 INFO [train.py:1039] (0/4) Epoch 17, batch 3550, loss[loss=0.1681, simple_loss=0.2135, pruned_loss=0.06132, over 19118.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2544, pruned_loss=0.05397, over 4691307.42 frames. ], batch size: 389, lr: 6.04e-03, grad_scale: 8.0 2023-09-30 04:03:56,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:56,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:03:57,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 04:04:02,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:04:07,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=590293.3333333334, ans=0.125 2023-09-30 04:04:08,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=590293.3333333334, ans=0.125 2023-09-30 04:04:09,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:13,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:04:14,025 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.42 vs. limit=15.0 2023-09-30 04:04:16,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:18,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:04:21,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:21,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:04:21,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 04:04:26,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:26,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:04:26,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:26,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:04:27,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:04:33,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:04:33,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:34,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:34,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:34,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:04:34,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 04:04:34,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:04:35,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=590426.6666666666, ans=0.125 2023-09-30 04:04:37,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 04:04:37,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=590426.6666666666, ans=0.2 2023-09-30 04:04:43,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:45,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:47,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:48,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 04:04:50,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:04:51,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 04:04:52,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:54,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 04:04:56,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:04:57,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 04:04:58,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 04:05:06,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:11,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:05:13,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 04:05:17,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 04:05:19,810 INFO [train.py:1039] (0/4) Epoch 17, batch 3600, loss[loss=0.1719, simple_loss=0.2484, pruned_loss=0.0477, over 24622.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2543, pruned_loss=0.05361, over 4703448.32 frames. ], batch size: 60, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:05:19,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:05:21,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:05:23,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:24,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:25,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:05:30,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:31,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:31,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=590626.6666666666, ans=0.125 2023-09-30 04:05:32,310 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.83 vs. limit=22.5 2023-09-30 04:05:33,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:05:33,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:05:33,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:05:33,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 04:05:38,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:05:39,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:41,045 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.895e+02 2.111e+02 2.493e+02 3.633e+02, threshold=4.223e+02, percent-clipped=0.0 2023-09-30 04:05:43,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:44,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:05:46,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:05:47,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:47,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 04:05:49,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:52,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:53,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 04:05:57,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:59,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:06:00,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:02,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 04:06:10,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:10,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:06:11,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 04:06:15,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:06:20,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:21,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=590826.6666666666, ans=0.1 2023-09-30 04:06:23,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:29,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. 
Duration: 0.788875 2023-09-30 04:06:29,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:06:29,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 04:06:33,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 04:06:35,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 04:06:36,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:38,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:06:39,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 04:06:39,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:06:40,261 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:06:40,496 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.78 vs. limit=10.0 2023-09-30 04:06:41,267 INFO [train.py:1039] (0/4) Epoch 17, batch 3650, loss[loss=0.1593, simple_loss=0.2319, pruned_loss=0.04333, over 24326.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2548, pruned_loss=0.05381, over 4702237.35 frames. 
], batch size: 56, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:06:41,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:06:41,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:41,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 04:06:42,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 04:06:46,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:47,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 04:06:52,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 04:06:55,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:06:59,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 04:07:00,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 04:07:05,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:05,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 04:07:06,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:07:10,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 04:07:10,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:07:12,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 04:07:13,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:07:13,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:13,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 04:07:15,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:07:16,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:18,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:07:19,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. 
Duration: 0.77 2023-09-30 04:07:21,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 04:07:21,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:07:24,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 04:07:24,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:24,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:07:31,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:07:33,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:33,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:07:34,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:07:35,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:07:38,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 04:07:39,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=591160.0, ans=0.125 2023-09-30 04:07:41,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:41,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=591160.0, ans=0.125 2023-09-30 04:07:43,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:43,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:44,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:07:44,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:45,656 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.53 vs. limit=15.0 2023-09-30 04:07:46,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:46,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=591226.6666666666, ans=0.1 2023-09-30 04:07:51,381 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. 
Duration: 0.896 2023-09-30 04:07:53,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=591226.6666666666, ans=0.125 2023-09-30 04:07:54,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:54,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:56,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:07:56,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:07:58,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:07:58,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:58,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=591226.6666666666, ans=0.0 2023-09-30 04:07:59,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 04:07:59,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:03,234 INFO [train.py:1039] (0/4) Epoch 17, batch 3700, loss[loss=0.2592, simple_loss=0.309, pruned_loss=0.1047, over 19451.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2558, pruned_loss=0.05409, over 4702918.86 frames. 
], batch size: 388, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:08:03,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:08:04,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:08:05,336 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.36 vs. limit=15.0 2023-09-30 04:08:06,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:08:10,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:10,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 04:08:10,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:10,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:08:10,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:08:16,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:08:20,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 04:08:20,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:22,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:08:23,657 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.066e+02 2.372e+02 2.822e+02 4.453e+02, threshold=4.744e+02, percent-clipped=1.0 2023-09-30 04:08:23,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:23,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:08:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:25,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=591360.0, ans=0.0 2023-09-30 04:08:27,449 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.33 vs. limit=22.5 2023-09-30 04:08:29,082 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 04:08:35,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:08:35,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:08:35,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 04:08:35,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 04:08:37,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:40,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:42,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 04:08:42,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:45,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:08:47,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:47,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:08:49,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:08:52,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:53,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 04:08:54,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:08:55,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 04:08:59,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=591493.3333333334, ans=0.0 2023-09-30 04:09:02,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:09:02,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:09:05,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:06,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 04:09:08,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:09:08,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:09:08,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:08,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:13,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:15,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. 
Duration: 0.93 2023-09-30 04:09:16,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 04:09:16,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:09:16,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:19,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:09:19,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:09:22,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:09:22,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:09:24,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:09:25,109 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.05 vs. limit=12.0 2023-09-30 04:09:25,784 INFO [train.py:1039] (0/4) Epoch 17, batch 3750, loss[loss=0.1774, simple_loss=0.242, pruned_loss=0.05638, over 23516.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2576, pruned_loss=0.05479, over 4697529.89 frames. ], batch size: 134, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:09:26,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. 
Duration: 0.89 2023-09-30 04:09:27,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=591626.6666666666, ans=0.04949747468305833 2023-09-30 04:09:28,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:09:31,042 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.74 vs. limit=15.0 2023-09-30 04:09:31,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:09:31,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 04:09:33,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:09:34,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:35,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:38,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:09:41,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:46,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 04:09:47,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:09:49,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:52,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:09:53,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 04:09:55,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:09:56,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:09:56,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:59,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 04:10:06,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 04:10:06,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:10:06,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 04:10:06,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=591760.0, ans=0.2 2023-09-30 04:10:09,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:14,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:15,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:10:22,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 04:10:25,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:26,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:10:28,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:10:30,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=591893.3333333334, ans=0.0 2023-09-30 04:10:31,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:10:36,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 04:10:38,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:10:40,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:10:41,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:10:44,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:10:46,643 INFO [train.py:1039] (0/4) Epoch 17, batch 3800, loss[loss=0.1686, simple_loss=0.2552, pruned_loss=0.04107, over 24653.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2584, pruned_loss=0.05494, over 4701195.36 frames. ], batch size: 68, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:10:49,128 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.06 vs. limit=15.0 2023-09-30 04:10:52,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:10:57,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:58,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:11:00,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 04:11:01,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:11:03,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:05,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:11:06,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:11:06,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:06,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:11:08,672 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.935e+02 2.193e+02 2.576e+02 3.771e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-30 04:11:10,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:11,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:11:11,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:11,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 04:11:15,250 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=592026.6666666666, ans=0.0 2023-09-30 04:11:16,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. 
Duration: 0.4 2023-09-30 04:11:16,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:11:18,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:20,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:11:20,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:11:23,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:11:23,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:26,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:28,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:34,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:11:34,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 04:11:35,025 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=592160.0, ans=0.0 2023-09-30 04:11:36,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:11:43,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:11:44,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=592160.0, ans=0.125 2023-09-30 04:11:48,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:11:50,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 04:11:52,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 04:11:53,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:55,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:55,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:58,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 04:12:01,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 04:12:01,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 04:12:01,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:12:03,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:12:08,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:12:10,434 INFO [train.py:1039] (0/4) Epoch 17, batch 3850, loss[loss=0.1658, simple_loss=0.2377, pruned_loss=0.04694, over 23524.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2574, pruned_loss=0.05457, over 4713604.36 frames. ], batch size: 134, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:12:10,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:12:15,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:12:17,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=592293.3333333334, ans=0.125 2023-09-30 04:12:18,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 04:12:18,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:12:20,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:23,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:12:27,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 04:12:30,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:12:30,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 04:12:34,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:37,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:40,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:40,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:12:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:43,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:12:44,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:44,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:12:45,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:49,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 04:12:50,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:50,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:12:52,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 04:12:52,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 04:12:53,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:55,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:57,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:12:57,718 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.12 vs. limit=22.5 2023-09-30 04:12:58,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:59,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 04:13:02,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 04:13:03,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:13:05,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 04:13:08,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:13:11,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:13,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:13:18,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:18,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 04:13:20,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 04:13:23,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:23,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:26,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:13:26,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:13:27,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 04:13:27,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:13:27,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 04:13:29,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:13:29,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 04:13:29,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:29,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:31,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:13:31,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:32,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:13:33,362 INFO [train.py:1039] (0/4) Epoch 17, batch 3900, loss[loss=0.1768, simple_loss=0.2381, pruned_loss=0.05778, over 23483.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2555, pruned_loss=0.05411, over 4707627.75 frames. 
], batch size: 285, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:13:33,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:33,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:35,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:13:35,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 04:13:35,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:38,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:39,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:39,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:13:41,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:44,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:44,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 04:13:48,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:13:49,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 04:13:49,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:13:51,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 04:13:51,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:53,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 04:13:54,483 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.932e+02 2.197e+02 2.548e+02 3.814e+02, threshold=4.393e+02, percent-clipped=0.0 2023-09-30 04:13:54,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 04:13:59,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:03,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:14:03,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:14:04,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 04:14:06,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=592760.0, ans=0.1 2023-09-30 04:14:07,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:09,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:14:11,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:14:11,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:12,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:14:19,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:19,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:14:28,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:14:30,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 04:14:34,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=592826.6666666666, ans=0.125 2023-09-30 04:14:37,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=592893.3333333334, ans=0.125 2023-09-30 04:14:39,319 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=592893.3333333334, ans=0.1 2023-09-30 04:14:40,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:14:43,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:45,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 04:14:45,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 04:14:45,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:46,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 04:14:48,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:49,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 04:14:54,781 INFO [train.py:1039] (0/4) Epoch 17, batch 3950, loss[loss=0.1593, simple_loss=0.2392, pruned_loss=0.03969, over 24283.00 frames. 
], tot_loss[loss=0.1817, simple_loss=0.2552, pruned_loss=0.05407, over 4689962.23 frames. ], batch size: 61, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:14:58,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:58,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 04:15:00,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:15:00,606 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.07 vs. limit=12.0 2023-09-30 04:15:03,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:15:04,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:15:11,390 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 04:15:11,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:13,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 04:15:13,646 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 04:15:15,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 04:15:18,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:18,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:15:19,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:22,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 04:15:24,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:15:24,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:24,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:15:25,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:15:25,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:15:28,451 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=593093.3333333334, ans=0.2 2023-09-30 04:15:36,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:15:36,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 04:15:43,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 04:15:46,906 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:15:48,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 04:15:48,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 04:15:48,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:15:48,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:15:52,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=593160.0, ans=0.125 2023-09-30 04:15:53,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=593160.0, ans=0.0 2023-09-30 04:15:56,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:15:56,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:15:56,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:57,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 04:15:57,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 04:16:05,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:16:06,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:16:08,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593226.6666666666, ans=0.1 2023-09-30 04:16:08,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=593226.6666666666, ans=0.0 2023-09-30 04:16:11,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 04:16:18,000 INFO [train.py:1039] (0/4) Epoch 17, batch 4000, loss[loss=0.1669, simple_loss=0.2408, pruned_loss=0.04652, over 24416.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2563, pruned_loss=0.05444, over 4683582.45 frames. ], batch size: 58, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:16:19,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:24,214 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.67 vs. limit=15.0 2023-09-30 04:16:28,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:32,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:16:34,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:16:34,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:34,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 04:16:36,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:16:38,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 04:16:38,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:16:38,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 04:16:40,235 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.856e+02 2.093e+02 2.279e+02 3.915e+02, threshold=4.185e+02, percent-clipped=0.0 2023-09-30 04:16:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:40,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593360.0, ans=0.1 2023-09-30 04:16:43,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:16:43,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:16:43,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:16:43,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:16:43,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:16:46,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:16:47,871 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 04:16:49,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:16:51,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:16:54,382 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 04:16:55,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:16:55,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:16:57,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=593426.6666666666, ans=0.125 2023-09-30 04:17:04,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. 
Duration: 0.95 2023-09-30 04:17:05,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:17:08,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:17:09,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=593493.3333333334, ans=0.1 2023-09-30 04:17:10,185 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 04:17:10,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:17:10,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 04:17:11,670 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=12.0 2023-09-30 04:17:12,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:17:12,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:14,200 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:17:15,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:17:16,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 04:17:16,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:17:16,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:18,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 04:17:18,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:22,300 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 04:17:26,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:17:30,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:17:32,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:17:32,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=593560.0, ans=0.0 2023-09-30 04:17:34,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:35,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:17:35,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:17:39,926 INFO [train.py:1039] (0/4) Epoch 17, batch 4050, loss[loss=0.1818, simple_loss=0.2689, pruned_loss=0.04738, over 24404.00 frames. ], tot_loss[loss=0.183, simple_loss=0.257, pruned_loss=0.05451, over 4698905.77 frames. ], batch size: 77, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:17:41,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:44,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:17:45,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 04:17:48,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:17:48,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:17:49,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:17:51,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:17:52,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:56,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:59,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 04:18:00,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:18:02,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:18:03,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:18:07,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:09,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:18:12,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 04:18:13,304 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=593760.0, ans=0.0 2023-09-30 04:18:14,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 04:18:14,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 04:18:17,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:18:18,473 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.85 vs. 
limit=22.5 2023-09-30 04:18:21,585 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:18:22,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 04:18:25,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:26,510 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.24 vs. limit=22.5 2023-09-30 04:18:27,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:29,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=593826.6666666666, ans=0.0 2023-09-30 04:18:32,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:32,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:18:32,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:35,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:18:38,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 04:18:38,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 04:18:42,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:43,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 04:18:49,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:55,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 04:18:57,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:57,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:18:59,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=593893.3333333334, ans=0.0 2023-09-30 04:19:00,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 04:19:00,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 04:19:00,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:01,689 INFO [train.py:1039] (0/4) Epoch 17, batch 4100, loss[loss=0.1849, simple_loss=0.2678, pruned_loss=0.05104, over 24522.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.258, pruned_loss=0.05527, over 4689997.87 frames. 
], batch size: 66, lr: 6.02e-03, grad_scale: 32.0 2023-09-30 04:19:01,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:03,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:03,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:19:09,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 04:19:09,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 04:19:11,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 04:19:11,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 04:19:11,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:13,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:19:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. 
Duration: 0.989 2023-09-30 04:19:20,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:20,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:19:20,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:21,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:19:24,773 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.800e+02 1.954e+02 2.140e+02 3.094e+02, threshold=3.909e+02, percent-clipped=0.0 2023-09-30 04:19:25,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:19:26,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:26,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:19:29,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 04:19:30,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:30,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 04:19:30,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:32,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:19:32,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 04:19:35,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:19:35,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 04:19:38,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:19:40,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:40,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 04:19:41,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:43,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:19:43,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 04:19:45,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594093.3333333334, ans=0.1 2023-09-30 04:19:46,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 04:19:49,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:19:51,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:19:53,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 04:19:55,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:55,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:19:58,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:01,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:06,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:07,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:20:13,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:20:13,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:16,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:20,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:20:23,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:20:25,115 INFO [train.py:1039] (0/4) Epoch 17, batch 4150, loss[loss=0.1731, simple_loss=0.2614, pruned_loss=0.04239, over 24413.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2585, pruned_loss=0.05509, over 4702453.58 frames. ], batch size: 69, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:20:25,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:20:25,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:20:25,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:29,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 04:20:29,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:30,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. 
Duration: 0.49 2023-09-30 04:20:30,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 04:20:30,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 04:20:32,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=594293.3333333334, ans=0.5 2023-09-30 04:20:33,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:36,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594293.3333333334, ans=0.1 2023-09-30 04:20:37,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:20:37,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:41,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:20:41,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:20:42,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:20:42,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-30 04:20:44,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:46,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:20:50,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:55,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:20:57,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 04:20:59,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 04:20:59,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:21:01,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 04:21:01,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:21:01,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:04,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:05,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:21:10,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 04:21:13,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:16,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:21:17,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 04:21:17,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:21:20,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 04:21:20,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:21:22,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:25,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:26,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 04:21:26,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:21:26,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. 
Duration: 0.463625 2023-09-30 04:21:28,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:21:30,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 04:21:30,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:30,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:21:32,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:21:34,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 04:21:34,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:34,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:21:35,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:21:37,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:37,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 04:21:38,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. 
Duration: 0.788875 2023-09-30 04:21:45,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:21:47,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 04:21:48,365 INFO [train.py:1039] (0/4) Epoch 17, batch 4200, loss[loss=0.1888, simple_loss=0.2644, pruned_loss=0.05655, over 23269.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2566, pruned_loss=0.0545, over 4706955.75 frames. ], batch size: 93, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:21:48,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:21:52,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:21:53,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:21:55,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:55,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:57,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 04:22:00,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 04:22:00,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:22:03,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:08,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:22:10,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=594693.3333333334, ans=0.1 2023-09-30 04:22:12,240 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.960e+02 2.323e+02 2.795e+02 4.279e+02, threshold=4.646e+02, percent-clipped=2.0 2023-09-30 04:22:12,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:22:15,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:15,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:16,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 04:22:16,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:18,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:18,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:22:18,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 04:22:20,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:22:23,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 04:22:23,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:24,426 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=594760.0, ans=0.125 2023-09-30 04:22:26,441 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=594760.0, ans=0.125 2023-09-30 04:22:27,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:22:29,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:22:29,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:22:32,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:22:34,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:22:34,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 04:22:35,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:22:35,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:22:41,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:22:44,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:44,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594826.6666666666, ans=0.1 2023-09-30 04:22:49,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:22:52,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 04:22:54,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:00,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:23:01,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:03,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 04:23:10,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:23:12,322 INFO [train.py:1039] (0/4) Epoch 17, batch 4250, loss[loss=0.1605, simple_loss=0.2425, pruned_loss=0.0392, over 24501.00 frames. 
], tot_loss[loss=0.1818, simple_loss=0.2554, pruned_loss=0.05413, over 4718947.28 frames. ], batch size: 66, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:23:15,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:23:15,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:23:17,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:23,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:23:25,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 04:23:25,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:23:28,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:32,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:33,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=595026.6666666666, ans=0.125 2023-09-30 04:23:36,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:36,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:23:40,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:23:40,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:23:41,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:41,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:43,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:46,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:23:47,208 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:23:47,801 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.47 vs. limit=15.0 2023-09-30 04:23:48,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:50,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 04:23:52,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=595093.3333333334, ans=0.07 2023-09-30 04:23:53,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. 
Duration: 0.85 2023-09-30 04:23:53,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:53,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:53,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:54,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:23:54,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:55,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:58,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:24:00,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:24:05,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:07,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:07,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 04:24:07,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 04:24:09,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 04:24:11,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:24:12,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:24:13,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=595160.0, ans=0.2 2023-09-30 04:24:14,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:15,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:24:17,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 04:24:19,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:24:19,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 04:24:24,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=595226.6666666666, ans=10.0 2023-09-30 04:24:24,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=595226.6666666666, ans=0.125 2023-09-30 04:24:25,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:28,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:30,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:24:30,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:31,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:24:34,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:24:34,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 04:24:35,513 INFO [train.py:1039] (0/4) Epoch 17, batch 4300, loss[loss=0.1941, simple_loss=0.2585, pruned_loss=0.06488, over 23806.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2546, pruned_loss=0.0539, over 4703708.35 frames. 
], batch size: 164, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:24:35,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:40,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:40,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:24:41,279 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=595293.3333333334, ans=0.2 2023-09-30 04:24:43,472 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=595293.3333333334, ans=0.125 2023-09-30 04:24:46,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:55,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:55,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 04:24:57,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:24:59,976 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.855e+02 2.115e+02 2.472e+02 4.142e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 04:25:00,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 04:25:00,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:25:00,157 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 04:25:03,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:25:04,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:08,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 04:25:08,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:25:09,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 04:25:11,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:25:15,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:25:18,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:25:18,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:25:19,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 04:25:21,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:23,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:25:23,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 04:25:24,092 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.88 vs. limit=15.0 2023-09-30 04:25:24,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 04:25:26,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:25:29,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:25:29,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:31,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 04:25:31,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. 
Duration: 0.95 2023-09-30 04:25:33,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 04:25:34,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:25:34,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 04:25:36,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 04:25:39,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:41,050 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 04:25:41,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:25:44,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:44,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:46,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 04:25:47,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:47,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:25:49,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:25:49,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:51,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:25:53,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:25:53,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:54,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:55,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=595560.0, ans=0.125 2023-09-30 04:25:57,819 INFO [train.py:1039] (0/4) Epoch 17, batch 4350, loss[loss=0.1936, simple_loss=0.2601, pruned_loss=0.06354, over 23532.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2552, pruned_loss=0.0539, over 4713070.88 frames. ], batch size: 120, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:26:00,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 04:26:01,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-09-30 04:26:06,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:09,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:11,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:26:11,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:26:16,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:26:20,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:24,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:26:24,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:26:27,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:26:29,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=595760.0, ans=0.5 2023-09-30 04:26:30,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:26:32,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 04:26:35,107 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=15.0 2023-09-30 04:26:37,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 04:26:39,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:41,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:42,106 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.19 vs. limit=12.0 2023-09-30 04:26:44,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:47,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 04:26:51,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:26:52,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:26:59,286 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 04:27:00,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:00,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 04:27:03,800 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 04:27:03,927 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 04:27:03,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:03,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:05,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:27:05,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:07,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:07,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:10,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 04:27:10,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:10,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:10,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:27:11,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 04:27:13,940 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 04:27:13,947 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 04:27:13,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 04:27:18,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:27:19,862 INFO [train.py:1039] (0/4) Epoch 17, batch 4400, loss[loss=0.1649, simple_loss=0.2413, pruned_loss=0.04421, over 24414.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2557, pruned_loss=0.0538, over 4717300.59 frames. ], batch size: 58, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:27:19,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:27:19,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:21,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:27:23,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 04:27:24,633 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 04:27:24,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:27:28,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:28,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:30,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:31,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 04:27:33,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 04:27:33,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 04:27:33,850 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 04:27:35,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:27:35,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:38,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 04:27:40,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:41,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 04:27:41,842 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 04:27:44,837 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.901e+02 2.144e+02 2.567e+02 3.604e+02, threshold=4.289e+02, percent-clipped=0.0 2023-09-30 04:27:44,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:45,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 04:27:46,409 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 04:27:46,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=596026.6666666666, ans=0.1 2023-09-30 04:27:48,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 04:27:48,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 04:27:51,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 04:27:51,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:52,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 04:27:52,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:54,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 04:27:54,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 04:27:54,731 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=596093.3333333334, ans=0.0 2023-09-30 04:27:55,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:57,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:27:57,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:59,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:01,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:28:01,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 04:28:01,192 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. 
Duration: 0.768 2023-09-30 04:28:02,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=596093.3333333334, ans=0.125 2023-09-30 04:28:02,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=596093.3333333334, ans=0.125 2023-09-30 04:28:05,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:11,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=596160.0, ans=0.125 2023-09-30 04:28:13,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:28:14,197 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=596160.0, ans=0.125 2023-09-30 04:28:14,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=596160.0, ans=0.125 2023-09-30 04:28:16,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 04:28:20,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:28:23,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:27,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 04:28:27,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 04:28:27,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:28:27,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:28:27,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:28:28,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:28:32,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 04:28:32,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=596226.6666666666, ans=0.1 2023-09-30 04:28:35,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 04:28:35,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 04:28:36,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:28:37,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 04:28:38,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 04:28:40,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:28:44,018 INFO [train.py:1039] (0/4) Epoch 17, batch 4450, loss[loss=0.1862, simple_loss=0.2688, pruned_loss=0.05175, over 24316.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2569, pruned_loss=0.05419, over 4721928.16 frames. ], batch size: 74, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:28:44,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 04:28:47,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:48,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:48,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:28:55,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:28:57,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:29:00,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:03,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:29:03,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 04:29:03,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:07,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 04:29:07,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:07,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:07,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:07,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:29:07,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=596360.0, ans=0.125 2023-09-30 04:29:07,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=596360.0, ans=0.0 2023-09-30 04:29:10,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:29:16,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 04:29:16,840 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=596426.6666666666, ans=0.0 2023-09-30 04:29:18,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:18,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:29:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:29:22,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=596426.6666666666, ans=0.0 2023-09-30 04:29:24,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 04:29:24,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 04:29:24,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:29:29,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:30,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 04:29:34,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:29:40,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:29:41,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 04:29:41,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:41,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:41,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:29:41,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:43,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:46,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=596493.3333333334, ans=0.125 2023-09-30 04:29:47,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:29:47,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 04:29:48,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=596493.3333333334, ans=0.125 2023-09-30 04:29:50,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 04:29:51,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:53,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:54,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:56,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:29:59,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:30:01,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 04:30:02,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:30:07,629 INFO [train.py:1039] (0/4) Epoch 17, batch 4500, loss[loss=0.1735, simple_loss=0.2552, pruned_loss=0.0459, over 24523.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2579, pruned_loss=0.05408, over 4725431.50 frames. ], batch size: 63, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:30:07,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:09,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 04:30:09,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. 
Duration: 0.86 2023-09-30 04:30:11,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:15,144 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=596626.6666666666, ans=0.0 2023-09-30 04:30:19,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:30:19,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:21,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:30:22,181 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.75 vs. limit=12.0 2023-09-30 04:30:23,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:30:23,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:23,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:30:29,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=596693.3333333334, ans=0.125 2023-09-30 04:30:32,271 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.892e+02 2.101e+02 2.322e+02 3.249e+02, threshold=4.203e+02, percent-clipped=0.0 2023-09-30 04:30:35,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:35,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:30:38,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:30:40,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:30:41,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:30:49,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:30:52,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:30:56,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:31:00,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 04:31:01,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 04:31:01,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:03,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:04,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:06,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:31:07,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:31:07,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 04:31:07,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:31:07,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:13,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:31:14,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 04:31:14,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=596893.3333333334, ans=0.0 2023-09-30 04:31:18,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:22,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:31:22,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:31:25,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 04:31:25,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 04:31:25,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 04:31:30,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 04:31:31,541 INFO [train.py:1039] (0/4) Epoch 17, batch 4550, loss[loss=0.1993, simple_loss=0.2578, pruned_loss=0.07036, over 23879.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2568, pruned_loss=0.05421, over 4727449.47 frames. ], batch size: 195, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:31:31,916 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=596960.0, ans=0.2 2023-09-30 04:31:34,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. 
Duration: 0.87 2023-09-30 04:31:35,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:38,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:39,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:43,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:43,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=596960.0, ans=0.0 2023-09-30 04:31:48,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:31:49,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:51,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:31:51,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:31:51,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:53,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:53,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 04:31:56,058 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.95 vs. limit=22.5 2023-09-30 04:31:57,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:31:57,662 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=597026.6666666666, ans=0.2 2023-09-30 04:31:59,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=597026.6666666666, ans=0.125 2023-09-30 04:32:00,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 04:32:01,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 04:32:02,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=597093.3333333334, ans=0.1 2023-09-30 04:32:03,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:32:04,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 04:32:07,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 04:32:08,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:11,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. 
Duration: 0.74 2023-09-30 04:32:13,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:32:14,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:32:18,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 04:32:20,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=597160.0, ans=0.125 2023-09-30 04:32:21,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:25,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:25,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:26,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:26,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 04:32:28,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-09-30 04:32:28,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:32:30,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 04:32:31,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 04:32:31,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:35,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:35,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:32:37,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:37,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:32:39,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:32:40,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 04:32:41,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=597226.6666666666, ans=0.2 2023-09-30 04:32:42,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:32:43,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:32:43,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 04:32:43,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:32:43,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 04:32:46,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:32:47,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:32:50,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:32:50,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:50,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:32:52,993 INFO [train.py:1039] (0/4) Epoch 17, batch 4600, loss[loss=0.1681, simple_loss=0.2556, pruned_loss=0.04027, over 24470.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2552, pruned_loss=0.05374, over 4706182.78 frames. ], batch size: 63, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:32:53,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:32:54,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:32:57,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:59,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:33:03,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:33:03,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:33:05,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:05,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 04:33:07,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=597293.3333333334, ans=0.09899494936611666 2023-09-30 04:33:08,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:33:13,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:33:13,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:17,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:33:18,330 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.818e+02 2.016e+02 2.179e+02 3.781e+02, threshold=4.032e+02, percent-clipped=0.0 2023-09-30 04:33:23,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 04:33:23,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:26,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:29,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:33:30,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:33,362 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.91 vs. limit=22.5 2023-09-30 04:33:34,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=597426.6666666666, ans=0.1 2023-09-30 04:33:36,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 04:33:36,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:33:37,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 04:33:42,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:44,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:33:46,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:33:50,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 04:33:51,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:33:54,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=597493.3333333334, ans=0.0 2023-09-30 04:33:56,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:33:57,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:00,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 04:34:00,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:00,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. 
Duration: 0.89 2023-09-30 04:34:00,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:03,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:03,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:34:04,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:05,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 04:34:06,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 04:34:06,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 04:34:06,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:09,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:09,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:10,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:34:15,489 INFO [train.py:1039] (0/4) Epoch 17, batch 4650, loss[loss=0.1675, simple_loss=0.2536, pruned_loss=0.04074, over 24639.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2547, pruned_loss=0.05323, over 4712415.32 frames. ], batch size: 68, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:34:16,109 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:34:20,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:34:26,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:26,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:27,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:34:27,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:27,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:27,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:32,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 04:34:35,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 04:34:37,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=597693.3333333334, ans=0.04949747468305833 2023-09-30 04:34:38,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 04:34:38,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:40,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 04:34:40,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:34:40,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 04:34:42,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 04:34:42,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:43,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:34:45,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:34:47,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:47,296 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-09-30 04:34:50,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:52,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 04:34:56,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:56,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:34:56,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 04:34:57,023 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.38 vs. limit=15.0 2023-09-30 04:34:59,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:01,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:35:03,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=597760.0, ans=0.2 2023-09-30 04:35:05,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:06,247 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:35:10,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 04:35:13,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:15,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:15,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:35:17,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=597826.6666666666, ans=0.1 2023-09-30 04:35:18,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 04:35:18,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 04:35:20,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 04:35:20,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 04:35:20,623 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=597893.3333333334, ans=0.125 2023-09-30 04:35:22,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:30,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:35:30,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:35:31,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 04:35:31,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:32,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:32,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:35:34,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:35:35,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:35:35,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:36,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:39,591 INFO [train.py:1039] (0/4) Epoch 17, batch 4700, loss[loss=0.205, simple_loss=0.2565, pruned_loss=0.07678, over 19350.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2551, pruned_loss=0.05315, over 4721818.75 frames. ], batch size: 388, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:35:39,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:39,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 04:35:41,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:35:42,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 04:35:44,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:35:44,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 04:35:52,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:55,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:55,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:56,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:59,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:35:59,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=598026.6666666666, ans=0.125 2023-09-30 04:36:03,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 04:36:03,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. 
Duration: 0.97 2023-09-30 04:36:04,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.407e+02 1.853e+02 2.008e+02 2.274e+02 4.210e+02, threshold=4.016e+02, percent-clipped=1.0 2023-09-30 04:36:05,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=598026.6666666666, ans=0.1 2023-09-30 04:36:06,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:08,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:36:08,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:36:13,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:15,198 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.03 vs. limit=12.0 2023-09-30 04:36:17,000 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.31 vs. limit=15.0 2023-09-30 04:36:17,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:36:19,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:36:22,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:36:22,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=598093.3333333334, ans=0.125 2023-09-30 04:36:32,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 04:36:33,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:36:34,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=598160.0, ans=0.07 2023-09-30 04:36:35,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:38,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 04:36:38,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:36:43,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:36:43,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 04:36:46,190 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.67 vs. limit=15.0 2023-09-30 04:36:46,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:46,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:36:47,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=598226.6666666666, ans=0.0 2023-09-30 04:36:49,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=598226.6666666666, ans=0.125 2023-09-30 04:36:50,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:50,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:36:51,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 04:36:52,036 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 04:36:54,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:57,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 04:36:57,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:37:00,900 INFO [train.py:1039] (0/4) Epoch 17, batch 4750, loss[loss=0.1898, simple_loss=0.2733, pruned_loss=0.05309, over 24331.00 frames. 
], tot_loss[loss=0.1817, simple_loss=0.2564, pruned_loss=0.05354, over 4725850.34 frames. ], batch size: 74, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:37:03,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 04:37:06,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:37:06,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:37:13,154 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=598293.3333333334, ans=0.025 2023-09-30 04:37:15,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 04:37:16,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:19,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 04:37:21,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:37:21,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:37:22,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:24,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=598360.0, ans=0.0 2023-09-30 04:37:26,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 04:37:32,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:37:34,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 04:37:34,988 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.20 vs. limit=22.5 2023-09-30 04:37:36,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:39,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:39,854 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 04:37:39,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. 
Duration: 0.72 2023-09-30 04:37:46,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 04:37:48,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:51,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:37:54,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:37:54,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 04:37:54,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:37:54,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=598493.3333333334, ans=0.125 2023-09-30 04:37:56,447 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=598493.3333333334, ans=0.1 2023-09-30 04:37:57,071 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.02 vs. limit=22.5 2023-09-30 04:37:59,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:38:01,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:38:03,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. 
Duration: 0.4444375 2023-09-30 04:38:03,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 04:38:03,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:04,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:38:04,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:06,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:38:06,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 04:38:09,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 04:38:11,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:15,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:38:15,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 04:38:16,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:16,881 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.32 vs. 
limit=12.0 2023-09-30 04:38:19,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:21,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:38:21,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:22,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:38:24,072 INFO [train.py:1039] (0/4) Epoch 17, batch 4800, loss[loss=0.1603, simple_loss=0.2392, pruned_loss=0.04067, over 24437.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.257, pruned_loss=0.05372, over 4730092.30 frames. ], batch size: 58, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:38:26,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:26,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 04:38:26,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 04:38:28,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 04:38:31,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:38:32,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 04:38:34,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 04:38:35,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=598626.6666666666, ans=0.125 2023-09-30 04:38:39,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:39,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:44,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:38:47,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:47,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:47,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 04:38:49,112 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.366e+02 1.827e+02 2.030e+02 2.375e+02 4.462e+02, threshold=4.061e+02, percent-clipped=1.0 2023-09-30 04:38:49,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:49,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 04:38:50,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:38:54,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:56,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:56,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:38:57,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:57,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:38:57,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:59,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:02,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:05,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:39:07,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:39:08,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:39:09,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:11,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 04:39:11,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 04:39:12,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:12,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:39:12,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:39:12,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:12,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:39:16,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:39:16,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:39:19,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:23,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:23,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:28,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 04:39:29,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:29,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:29,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:39:30,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:33,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:35,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:39:35,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:36,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 04:39:36,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:39:37,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:39:39,867 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.16 vs. limit=15.0 2023-09-30 04:39:42,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:43,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:43,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:45,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 04:39:48,820 INFO [train.py:1039] (0/4) Epoch 17, batch 4850, loss[loss=0.1665, simple_loss=0.2362, pruned_loss=0.04841, over 13594.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2582, pruned_loss=0.05453, over 4702670.08 frames. ], batch size: 29, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:39:48,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 04:39:48,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:48,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 04:39:49,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:39:49,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:52,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:54,522 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.22 vs. limit=15.0 2023-09-30 04:39:56,250 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=598960.0, ans=0.0 2023-09-30 04:40:01,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 04:40:03,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:09,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:11,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:40:11,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:12,128 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.49 vs. 
limit=22.5 2023-09-30 04:40:13,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=599026.6666666666, ans=0.0 2023-09-30 04:40:14,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:16,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:40:17,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:40:17,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 04:40:21,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=599093.3333333334, ans=0.2 2023-09-30 04:40:23,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:40:24,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:40:26,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:40:26,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:40:26,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. 
Duration: 0.76 2023-09-30 04:40:27,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=599093.3333333334, ans=0.1 2023-09-30 04:40:29,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:29,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 04:40:33,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 04:40:34,193 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.48 vs. limit=22.5 2023-09-30 04:40:35,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:40:42,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:40:42,937 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.02 vs. limit=15.0 2023-09-30 04:40:43,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. 
Duration: 0.89 2023-09-30 04:40:45,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:40:45,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:40:48,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:40:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 04:40:48,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:49,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 04:40:49,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:51,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:40:53,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 04:41:03,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:05,265 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:41:07,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 04:41:08,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:11,685 INFO [train.py:1039] (0/4) Epoch 17, batch 4900, loss[loss=0.17, simple_loss=0.2529, pruned_loss=0.04359, over 24536.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2565, pruned_loss=0.05377, over 4705097.99 frames. ], batch size: 71, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:41:15,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 04:41:15,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:41:18,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=599293.3333333334, ans=0.2 2023-09-30 04:41:19,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:21,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:21,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:41:24,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 04:41:24,996 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:41:30,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. 
Duration: 0.99 2023-09-30 04:41:35,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 04:41:36,397 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.955e+02 2.270e+02 2.616e+02 3.974e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-30 04:41:36,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 04:41:36,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:36,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:36,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:41:36,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:36,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:41:38,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 04:41:43,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 04:41:43,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:41:44,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 04:41:45,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:46,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:41:48,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:48,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:48,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 04:41:50,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:41:51,007 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=599426.6666666666, ans=0.0 2023-09-30 04:41:52,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:52,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 04:41:52,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 04:41:56,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 04:41:58,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 04:41:59,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:41:59,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:42:01,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:02,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:42:02,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:42:02,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=599493.3333333334, ans=0.0 2023-09-30 04:42:03,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 04:42:05,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:05,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=599493.3333333334, ans=0.0 2023-09-30 04:42:07,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:42:09,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:42:13,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=599493.3333333334, ans=0.125 2023-09-30 04:42:15,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 04:42:15,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:42:15,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 04:42:16,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 04:42:21,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:23,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:42:25,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 04:42:25,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:26,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:42:27,183 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.57 vs. limit=15.0 2023-09-30 04:42:28,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:42:31,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:31,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:42:31,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:32,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:42:33,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=599626.6666666666, ans=0.0 2023-09-30 04:42:34,764 INFO [train.py:1039] (0/4) Epoch 17, batch 4950, loss[loss=0.1823, simple_loss=0.248, pruned_loss=0.05827, over 23253.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2549, pruned_loss=0.05341, over 4709724.36 frames. ], batch size: 105, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:42:34,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:42:39,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:39,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:42,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-30 04:42:43,169 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=599626.6666666666, ans=0.2 2023-09-30 04:42:44,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 04:42:44,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:42:45,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 04:42:45,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:45,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:42:47,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:42:47,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:42:49,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:49,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:42:51,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:42:52,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 04:42:55,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:55,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:59,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:43:05,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:06,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:43:08,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:08,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:08,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=599760.0, ans=0.2 2023-09-30 04:43:12,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:43:12,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 04:43:12,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 04:43:15,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:43:18,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:43:18,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:43:20,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:43:21,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:43:21,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:43:24,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:26,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:43:27,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=599826.6666666666, ans=0.0 2023-09-30 04:43:28,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:43:31,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:31,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:31,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-09-30 04:43:33,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:43:33,785 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.35 vs. limit=22.5 2023-09-30 04:43:35,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:43:38,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:43:39,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:43:39,951 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:43:40,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:41,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:43:41,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:43:45,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:43:45,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 04:43:45,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:46,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 04:43:51,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:43:55,628 INFO [train.py:1039] (0/4) Epoch 17, batch 5000, loss[loss=0.1945, simple_loss=0.2633, pruned_loss=0.06283, over 23792.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2543, pruned_loss=0.05337, over 4709353.50 frames. ], batch size: 212, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:43:57,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 04:43:57,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:44:05,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:05,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:06,758 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.06 vs. limit=15.0 2023-09-30 04:44:07,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 04:44:07,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. 
Duration: 0.95 2023-09-30 04:44:09,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:44:11,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 04:44:11,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:44:11,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:44:12,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 04:44:14,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:16,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:16,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 04:44:17,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:17,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:18,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=600026.6666666666, ans=22.5 2023-09-30 04:44:20,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. 
Duration: 0.95 2023-09-30 04:44:20,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 04:44:20,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:44:22,124 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.867e+02 2.091e+02 2.376e+02 3.821e+02, threshold=4.182e+02, percent-clipped=0.0 2023-09-30 04:44:22,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 04:44:22,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:44:23,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:23,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:44:23,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 04:44:23,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 04:44:26,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 04:44:27,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:27,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:44:29,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 04:44:29,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:32,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:34,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:34,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:44:36,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 04:44:36,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:44:39,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:44:42,445 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 04:44:44,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:46,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:46,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:44:49,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 04:44:49,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:51,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:51,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:53,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 04:44:53,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:56,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:58,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:03,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 04:45:03,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=600226.6666666666, ans=0.125 2023-09-30 04:45:06,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:45:11,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=600226.6666666666, ans=0.125 2023-09-30 04:45:18,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:45:18,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=600293.3333333334, ans=0.125 2023-09-30 04:45:19,650 INFO [train.py:1039] (0/4) Epoch 17, batch 5050, loss[loss=0.1752, simple_loss=0.245, pruned_loss=0.0527, over 24322.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2551, pruned_loss=0.05394, over 4699743.91 frames. ], batch size: 56, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:45:19,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:19,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:45:19,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:19,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:45:19,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:45:19,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:45:23,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 04:45:23,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=600293.3333333334, ans=0.125 2023-09-30 04:45:25,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:45:28,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:29,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:45:29,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 04:45:31,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:31,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:45:35,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:45:36,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:45:36,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:45:46,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. 
Duration: 0.91 2023-09-30 04:45:46,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:45:48,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:45:48,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 04:45:48,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:45:50,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:50,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:50,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=600360.0, ans=0.0 2023-09-30 04:45:51,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:45:51,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 04:45:51,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 04:45:53,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:55,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 04:46:00,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:46:00,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 04:46:04,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:05,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 04:46:07,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:46:07,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:46:07,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:08,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:46:11,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:13,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:46:15,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:15,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:46:15,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:46:15,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 04:46:15,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:46:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:46:20,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:20,273 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 04:46:20,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:46:22,491 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=600493.3333333334, ans=0.04949747468305833 2023-09-30 04:46:23,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:25,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:25,343 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-09-30 04:46:28,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=600560.0, ans=0.1 2023-09-30 04:46:29,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:29,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 04:46:29,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:35,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 04:46:37,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 04:46:38,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:39,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=600560.0, ans=0.125 2023-09-30 04:46:39,231 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=600560.0, ans=0.0 2023-09-30 04:46:40,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:46:40,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:46:42,024 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 04:46:43,276 INFO [train.py:1039] (0/4) Epoch 17, batch 5100, loss[loss=0.1967, simple_loss=0.2655, pruned_loss=0.06393, over 23815.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2565, pruned_loss=0.05468, over 4700426.35 frames. ], batch size: 179, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:46:44,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:48,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 04:46:49,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 04:46:51,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:52,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:53,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=600626.6666666666, ans=0.125 2023-09-30 04:46:53,342 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.99 vs. limit=10.0 2023-09-30 04:46:55,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:46:55,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 04:46:55,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 04:47:02,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:47:04,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:47:08,745 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.763e+02 1.981e+02 2.119e+02 3.450e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 04:47:09,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:47:12,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 04:47:14,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:16,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:47:16,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:47:19,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:47:19,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 04:47:22,846 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 04:47:22,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:24,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 04:47:24,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 04:47:24,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=600760.0, ans=0.5 2023-09-30 04:47:26,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:37,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:47:39,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 04:47:39,550 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 04:47:39,562 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 04:47:39,690 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=600826.6666666666, ans=0.125 2023-09-30 04:47:41,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. 
Duration: 0.81 2023-09-30 04:47:41,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:44,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 04:47:50,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 04:47:53,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:47:54,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:47:57,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 04:47:59,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:47:59,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 04:48:02,471 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.96 vs. limit=6.0 2023-09-30 04:48:04,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=600960.0, ans=0.125 2023-09-30 04:48:05,690 INFO [train.py:1039] (0/4) Epoch 17, batch 5150, loss[loss=0.1656, simple_loss=0.2383, pruned_loss=0.04646, over 24403.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2571, pruned_loss=0.05486, over 4716329.44 frames. 
], batch size: 58, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:48:05,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:48:05,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:48:05,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:48:05,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:48:07,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:48:07,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:48:08,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 04:48:08,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 04:48:09,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 04:48:09,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:48:09,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 04:48:11,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:48:12,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 04:48:12,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:14,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:20,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:48:20,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 04:48:22,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:22,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:48:26,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:48:26,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:26,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:26,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:48:26,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 04:48:27,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 04:48:27,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:48:29,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:48:31,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:48:31,987 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.30 vs. limit=6.0 2023-09-30 04:48:32,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 04:48:34,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:48:40,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:48:42,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 04:48:46,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:48:53,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:00,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:00,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:02,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=601160.0, ans=22.5 2023-09-30 04:49:03,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 04:49:06,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:49:06,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:49:08,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:49:10,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=601226.6666666666, ans=0.125 2023-09-30 04:49:11,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:11,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:13,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. 
Duration: 0.55 2023-09-30 04:49:18,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:20,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:49:20,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=601226.6666666666, ans=0.125 2023-09-30 04:49:23,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:49:23,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:49:23,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:49:24,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:49:24,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:49:25,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:49:27,923 INFO [train.py:1039] (0/4) Epoch 17, batch 5200, loss[loss=0.1663, simple_loss=0.2463, pruned_loss=0.04318, over 24329.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2579, pruned_loss=0.05484, over 4712703.20 frames. ], batch size: 61, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:49:29,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 04:49:31,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:49:35,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:40,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 04:49:42,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:49:44,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:45,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:47,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:49:47,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:50,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 04:49:53,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:49:53,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:49:55,062 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.818e+02 1.970e+02 2.155e+02 3.146e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 04:49:55,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 04:49:57,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:49:57,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=601360.0, ans=0.125 2023-09-30 04:49:59,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:49:59,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 04:50:00,723 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 04:50:03,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 04:50:03,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:03,813 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 04:50:05,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:50:06,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:50:07,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:50:08,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 04:50:09,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:11,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:14,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 04:50:14,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 04:50:14,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 04:50:19,667 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:50:20,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 04:50:21,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:50:27,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:50:27,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:50:28,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 04:50:30,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:30,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 04:50:30,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:30,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:50:35,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:36,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:50:40,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:41,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:50:41,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:46,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:48,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. 
Duration: 0.47 2023-09-30 04:50:50,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:50,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:50:50,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:51,672 INFO [train.py:1039] (0/4) Epoch 17, batch 5250, loss[loss=0.1869, simple_loss=0.2496, pruned_loss=0.06208, over 23724.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2564, pruned_loss=0.05389, over 4711410.30 frames. ], batch size: 232, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:50:51,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:50:53,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:50:56,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:57,746 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.92 vs. limit=22.5 2023-09-30 04:51:00,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:00,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:51:01,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 04:51:07,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:51:07,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:51:11,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:51:13,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:51:15,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 04:51:15,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:17,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=601693.3333333334, ans=0.2 2023-09-30 04:51:18,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:51:38,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=601826.6666666666, ans=0.2 2023-09-30 04:51:44,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=601826.6666666666, ans=0.125 2023-09-30 04:52:06,485 INFO [train.py:1039] (0/4) Epoch 17, batch 5300, loss[loss=0.1743, simple_loss=0.2192, pruned_loss=0.06472, over 19263.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2558, pruned_loss=0.05368, over 4698125.35 frames. 
], batch size: 388, lr: 5.98e-03, grad_scale: 16.0 2023-09-30 04:52:17,996 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=601960.0, ans=0.125 2023-09-30 04:52:20,933 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.24 vs. limit=10.0 2023-09-30 04:52:21,661 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-17.pt 2023-09-30 04:52:27,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:52:27,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 04:52:27,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 04:52:27,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:28,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:28,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:28,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:28,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:28,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:52:28,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:28,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:52:29,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:52:29,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 04:52:29,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 04:52:29,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 04:52:29,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:52:29,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 04:52:29,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 04:52:30,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:30,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:30,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 04:52:30,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:30,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:52:31,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:31,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:31,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:32,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:32,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:32,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:52:32,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:32,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:52:33,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 04:52:33,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 04:52:33,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:33,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 04:52:33,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 04:52:33,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:52:33,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:52:33,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 04:52:34,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 04:52:34,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:34,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:52:35,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:35,406 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 04:52:35,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. 
Duration: 0.96 2023-09-30 04:52:35,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:52:35,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:35,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 04:52:36,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 04:52:36,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 04:52:36,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:39,885 INFO [train.py:1039] (0/4) Epoch 18, batch 0, loss[loss=0.1872, simple_loss=0.2721, pruned_loss=0.05118, over 24364.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2721, pruned_loss=0.05118, over 24364.00 frames. ], batch size: 77, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:52:39,886 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 04:52:53,312 INFO [train.py:1071] (0/4) Epoch 18, validation: loss=0.3168, simple_loss=0.2865, pruned_loss=0.1735, over 1125622.00 frames. 2023-09-30 04:52:53,313 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 04:52:57,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 04:52:58,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 04:52:58,925 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=602040.0, ans=0.125 2023-09-30 04:53:00,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:53:01,507 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.872e+02 2.065e+02 2.362e+02 3.138e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-30 04:53:06,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:06,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:53:06,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:08,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 04:53:09,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 04:53:11,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:13,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:53:16,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:16,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:53:16,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:19,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 04:53:19,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:29,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:53:29,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:31,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 04:53:34,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:53:34,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:53:36,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:41,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:53:44,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:51,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 04:53:52,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=602240.0, ans=0.0 2023-09-30 04:53:54,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 04:53:55,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:53:55,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:53:57,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:53:59,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:01,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 04:54:03,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:05,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:10,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 04:54:12,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=602306.6666666666, ans=0.09899494936611666 2023-09-30 04:54:12,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=602306.6666666666, ans=0.0 2023-09-30 04:54:13,558 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 04:54:15,544 INFO [train.py:1039] (0/4) Epoch 18, batch 50, loss[loss=0.1974, simple_loss=0.2715, pruned_loss=0.06164, over 23889.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2572, pruned_loss=0.05439, over 1062266.26 frames. ], batch size: 195, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:54:17,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:54:20,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:20,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:54:21,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 04:54:21,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:54:21,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:54:23,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 04:54:24,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:26,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:29,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 04:54:29,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:37,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:54:37,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 04:54:41,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 04:54:43,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:54:44,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:54:44,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:46,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:48,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 04:54:48,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:54:48,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:48,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=602506.6666666666, ans=0.0 2023-09-30 04:54:57,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:54:57,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=602506.6666666666, ans=0.125 2023-09-30 04:54:58,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:58,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:55:00,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 04:55:03,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:55:03,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:55:03,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 04:55:03,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 04:55:05,343 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=602573.3333333334, ans=0.125 2023-09-30 04:55:06,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 04:55:13,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:15,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:55:15,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:16,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:18,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:21,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 04:55:21,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 04:55:21,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:21,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:25,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 04:55:25,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:26,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 04:55:28,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 04:55:28,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:55:28,432 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=602640.0, ans=0.2 2023-09-30 04:55:29,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:31,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:55:31,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 04:55:31,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 04:55:32,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:34,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:35,735 INFO [train.py:1039] (0/4) Epoch 18, batch 100, loss[loss=0.175, simple_loss=0.2637, pruned_loss=0.04312, over 24545.00 frames. 
], tot_loss[loss=0.1831, simple_loss=0.2583, pruned_loss=0.05401, over 1884350.79 frames. ], batch size: 71, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:55:35,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:55:35,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:55:38,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:55:42,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:55:43,856 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:55:44,968 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.863e+02 2.072e+02 2.465e+02 3.411e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-30 04:55:45,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:47,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 04:55:47,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:51,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:55:51,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 04:55:51,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:51,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:51,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:52,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 04:55:54,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:55:55,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:55,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:55,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:56,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=602773.3333333334, ans=0.1 2023-09-30 04:56:00,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 04:56:01,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:02,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 04:56:04,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:56:05,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:56:06,481 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.35 vs. limit=10.0 2023-09-30 04:56:07,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=602840.0, ans=0.07 2023-09-30 04:56:10,194 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 04:56:10,219 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 04:56:11,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:11,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:56:15,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:56:16,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:17,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=602840.0, ans=0.2 2023-09-30 04:56:18,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 04:56:21,426 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.20 vs. limit=12.0 2023-09-30 04:56:24,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:25,658 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 04:56:27,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:56:30,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:56:31,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:56:35,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:38,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:40,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:56:43,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:56:46,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:47,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 04:56:49,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:49,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:56:51,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:51,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 04:56:51,451 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 04:56:51,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:53,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:56:53,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:53,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:53,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:56:55,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:56:55,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 04:56:55,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:56,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:56,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:57,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:56:58,449 INFO [train.py:1039] (0/4) Epoch 18, batch 150, loss[loss=0.1757, simple_loss=0.2628, pruned_loss=0.04427, over 24417.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2586, pruned_loss=0.0541, over 2525586.16 frames. ], batch size: 69, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:56:58,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:57:02,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:05,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:57:05,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:05,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 04:57:05,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=603040.0, ans=0.0 2023-09-30 04:57:07,128 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=603040.0, ans=0.0 2023-09-30 04:57:08,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:08,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:12,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:57:12,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:16,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 04:57:16,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 04:57:16,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 04:57:19,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:57:19,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:57:21,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 04:57:23,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:57:23,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:24,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:24,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:26,175 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 04:57:27,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:35,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:37,016 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=603173.3333333334, ans=0.0 2023-09-30 04:57:39,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:57:39,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 04:57:42,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:57:42,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 04:57:43,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:57:44,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:57:48,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:48,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=603240.0, ans=0.125 2023-09-30 04:57:49,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:57:51,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:52,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 04:57:53,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=603240.0, ans=0.2 2023-09-30 04:57:54,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=603240.0, ans=0.1 2023-09-30 04:57:58,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:58,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:57:58,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 04:57:59,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:58:01,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:02,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:58:05,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=603306.6666666666, ans=0.5 2023-09-30 04:58:06,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:58:07,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:58:09,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:11,562 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=603306.6666666666, ans=0.5 2023-09-30 04:58:12,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:58:12,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 04:58:12,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:58:12,757 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. 
Duration: 0.541 2023-09-30 04:58:17,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:20,318 INFO [train.py:1039] (0/4) Epoch 18, batch 200, loss[loss=0.1756, simple_loss=0.2628, pruned_loss=0.04419, over 24331.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2578, pruned_loss=0.05402, over 3020775.76 frames. ], batch size: 74, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:58:20,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:58:20,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:58:23,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 04:58:23,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:23,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:27,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 04:58:30,017 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.857e+02 2.048e+02 2.282e+02 3.617e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 04:58:30,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:58:32,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 04:58:32,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:32,902 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.92 vs. limit=15.0 2023-09-30 04:58:35,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:58:35,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:35,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:50,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=603440.0, ans=0.125 2023-09-30 04:58:56,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:58:57,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:58:58,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:58:59,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:59:00,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-30 04:59:00,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:59:01,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:03,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:59:03,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:03,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:06,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 04:59:06,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:59:06,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:11,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:59:21,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:27,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:29,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 04:59:34,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=603640.0, ans=0.2 2023-09-30 04:59:34,237 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=603640.0, ans=0.125 2023-09-30 04:59:35,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:35,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=603640.0, ans=0.125 2023-09-30 04:59:38,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 04:59:38,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:38,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:59:38,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:40,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:59:41,943 INFO [train.py:1039] (0/4) Epoch 18, batch 250, loss[loss=0.1924, simple_loss=0.2713, pruned_loss=0.05675, over 24028.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2581, pruned_loss=0.05377, over 3410944.91 frames. ], batch size: 80, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:59:42,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. 
Duration: 0.99 2023-09-30 04:59:44,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:59:44,135 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 04:59:45,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:59:47,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:52,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:59:52,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:53,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:59:56,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:05,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:09,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:00:09,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:00:18,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:00:18,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:00:20,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:00:20,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:22,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:00:22,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:00:22,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:26,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:00:29,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 05:00:29,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:31,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 05:00:32,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:00:32,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:00:32,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:00:35,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:00:35,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:00:37,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:38,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:00:38,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:42,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:00:47,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:50,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:56,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:00:58,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:01:03,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 05:01:03,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=604040.0, ans=0.125 2023-09-30 05:01:04,802 INFO [train.py:1039] (0/4) Epoch 18, batch 300, loss[loss=0.1812, simple_loss=0.2617, pruned_loss=0.05034, over 23988.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2556, pruned_loss=0.0534, over 3691421.65 frames. ], batch size: 80, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:01:04,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:04,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:01:06,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=604040.0, ans=0.125 2023-09-30 05:01:07,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 05:01:08,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:01:09,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:01:09,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-09-30 05:01:14,086 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.412e+02 1.790e+02 1.961e+02 2.231e+02 3.675e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 05:01:14,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:01:15,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:01:18,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:01:19,206 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=604106.6666666666, ans=0.125 2023-09-30 05:01:20,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 05:01:20,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:01:23,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:01:23,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 05:01:23,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:27,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 05:01:33,793 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-30 05:01:34,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:01:34,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 05:01:38,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 05:01:38,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:39,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:42,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:42,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 05:01:42,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:01:45,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:01:46,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:01:47,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:01:47,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=604173.3333333334, ans=0.125 2023-09-30 05:01:48,372 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-09-30 05:01:50,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:01:50,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 05:01:52,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:01:55,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:58,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 05:01:59,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:03,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:06,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:02:06,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 05:02:10,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:02:11,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:02:13,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:14,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:02:16,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 05:02:16,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:02:16,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:16,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 05:02:19,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:20,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:22,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:22,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:23,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:02:25,336 INFO [train.py:1039] (0/4) Epoch 18, batch 350, loss[loss=0.156, simple_loss=0.234, pruned_loss=0.03903, over 24460.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2549, pruned_loss=0.05299, over 3920937.87 frames. ], batch size: 58, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:02:27,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=604373.3333333334, ans=0.1 2023-09-30 05:02:28,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:28,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:02:30,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:35,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:39,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:40,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:43,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 05:02:43,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=604440.0, ans=0.125 2023-09-30 05:02:44,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 05:02:46,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 05:02:47,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:47,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 05:02:49,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:51,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 05:02:52,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:02:54,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:55,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:57,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:02:57,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:02:57,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:03:00,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:00,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:09,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:09,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:03:10,021 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.77 vs. limit=8.0 2023-09-30 05:03:10,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:03:10,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:10,892 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=604506.6666666666, ans=0.125 2023-09-30 05:03:16,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 05:03:16,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:22,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:03:22,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:22,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:03:24,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 05:03:27,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:27,214 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 05:03:30,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 05:03:30,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:33,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:03:33,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 05:03:36,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:36,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=604640.0, ans=0.0 2023-09-30 05:03:39,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 05:03:40,065 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=604640.0, ans=0.1 2023-09-30 05:03:42,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:43,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:43,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:46,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:46,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=604706.6666666666, ans=0.1 2023-09-30 05:03:47,507 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.62 vs. limit=15.0 2023-09-30 05:03:47,933 INFO [train.py:1039] (0/4) Epoch 18, batch 400, loss[loss=0.2081, simple_loss=0.2893, pruned_loss=0.06347, over 24394.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2548, pruned_loss=0.05323, over 4096845.00 frames. ], batch size: 77, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:03:49,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 05:03:52,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=604706.6666666666, ans=0.125 2023-09-30 05:03:53,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:03:53,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 05:03:53,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:55,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:55,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:03:57,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:03:58,493 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.749e+02 1.897e+02 2.088e+02 3.470e+02, threshold=3.794e+02, percent-clipped=0.0 2023-09-30 05:04:00,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:01,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:03,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. 
Duration: 0.69 2023-09-30 05:04:04,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 05:04:04,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:06,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 05:04:06,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:11,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:04:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:12,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 05:04:12,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:04:12,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:12,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:15,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:04:17,320 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. 
Duration: 0.896 2023-09-30 05:04:18,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 05:04:19,730 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-09-30 05:04:23,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:23,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:25,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 05:04:27,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 05:04:30,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:04:32,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:38,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 05:04:40,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:04:41,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 05:04:43,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:04:46,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:04:46,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 05:04:50,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:04:53,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:04:55,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:57,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:59,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 05:05:02,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:05:02,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 05:05:05,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:05:05,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:05:07,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. 
Duration: 0.89 2023-09-30 05:05:10,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:05:10,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:05:10,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:05:11,907 INFO [train.py:1039] (0/4) Epoch 18, batch 450, loss[loss=0.1695, simple_loss=0.2516, pruned_loss=0.04365, over 24653.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2554, pruned_loss=0.05372, over 4223845.30 frames. ], batch size: 65, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:05:12,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 05:05:13,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:05:13,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:05:13,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:05:13,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 05:05:15,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:05:16,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 05:05:18,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:05:21,280 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=605040.0, ans=0.035 2023-09-30 05:05:28,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:29,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:05:31,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 05:05:33,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 05:05:37,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:05:40,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:40,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:40,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=605106.6666666666, ans=0.0 2023-09-30 05:05:43,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:44,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:05:46,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 05:05:48,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 05:05:48,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 05:05:49,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:05:49,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:49,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:05:50,734 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-09-30 05:05:51,486 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 05:05:51,500 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 05:05:53,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:54,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:05:56,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-09-30 05:06:01,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:06:01,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:06:02,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:06:03,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 05:06:05,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:08,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:06:08,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:06:11,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 05:06:13,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=605240.0, ans=0.0 2023-09-30 05:06:15,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=605240.0, ans=10.0 2023-09-30 05:06:16,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:06:16,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. 
Duration: 0.91 2023-09-30 05:06:18,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 05:06:18,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:24,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:06:26,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:28,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:06:29,586 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 05:06:33,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:33,504 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=605373.3333333334, ans=0.04949747468305833 2023-09-30 05:06:34,109 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.70 vs. limit=15.0 2023-09-30 05:06:34,649 INFO [train.py:1039] (0/4) Epoch 18, batch 500, loss[loss=0.1944, simple_loss=0.2537, pruned_loss=0.06756, over 23788.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2561, pruned_loss=0.05358, over 4339477.79 frames. ], batch size: 164, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:06:34,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 05:06:36,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:36,357 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 05:06:37,861 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.56 vs. limit=15.0 2023-09-30 05:06:38,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 05:06:38,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:38,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=605373.3333333334, ans=0.015 2023-09-30 05:06:44,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:06:47,412 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.876e+02 2.083e+02 2.292e+02 4.355e+02, threshold=4.166e+02, percent-clipped=1.0 2023-09-30 05:06:48,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:06:50,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:06:53,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:53,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:06:55,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:00,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=605440.0, ans=0.125 2023-09-30 05:07:03,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:07:03,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:07:03,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 05:07:04,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:07:07,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:07:07,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:07:08,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=605506.6666666666, ans=0.2 2023-09-30 05:07:09,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 05:07:09,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:11,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 05:07:13,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=605506.6666666666, ans=0.1 2023-09-30 05:07:14,952 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 05:07:17,283 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.24 vs. limit=10.0 2023-09-30 05:07:18,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:20,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:07:24,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 05:07:27,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 05:07:28,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:32,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:07:35,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:42,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:46,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 05:07:46,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:46,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:50,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 05:07:52,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:07:55,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:55,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=605640.0, ans=0.125 2023-09-30 05:07:58,204 INFO [train.py:1039] (0/4) Epoch 18, batch 550, loss[loss=0.1959, simple_loss=0.2799, pruned_loss=0.05601, over 24374.00 frames. 
], tot_loss[loss=0.182, simple_loss=0.2567, pruned_loss=0.05366, over 4429504.45 frames. ], batch size: 77, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:08:01,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 05:08:02,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 05:08:02,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:02,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 05:08:02,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:08:02,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:04,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:08:06,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:08:09,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:08:09,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 05:08:09,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:08:13,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:13,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:17,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:18,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:27,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 05:08:28,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 05:08:30,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:08:36,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:08:37,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:38,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. 
Duration: 0.788875 2023-09-30 05:08:41,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:41,557 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 05:08:41,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:42,344 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-09-30 05:08:43,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:08:46,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:46,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:08:46,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:08:49,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:50,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 05:08:52,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 05:08:52,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:08:52,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:54,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:08:54,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:57,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:08:59,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:09:02,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:09:04,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:05,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=605973.3333333334, ans=0.125 2023-09-30 05:09:06,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 05:09:06,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:09:08,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:09:08,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:09:10,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:10,929 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.45 vs. limit=6.0 2023-09-30 05:09:11,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:09:13,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:09:18,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 05:09:19,458 INFO [train.py:1039] (0/4) Epoch 18, batch 600, loss[loss=0.1785, simple_loss=0.2688, pruned_loss=0.04411, over 24450.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2559, pruned_loss=0.05355, over 4505638.26 frames. ], batch size: 69, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:09:22,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 05:09:24,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:09:24,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:09:24,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:09:30,823 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.862e+02 2.137e+02 2.512e+02 3.782e+02, threshold=4.274e+02, percent-clipped=0.0 2023-09-30 05:09:31,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:09:34,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:09:34,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 05:09:36,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:09:39,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:09:41,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:43,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 05:09:43,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:09:48,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=606106.6666666666, ans=0.125 2023-09-30 05:09:49,157 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.14 vs. limit=8.0 2023-09-30 05:09:49,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-09-30 05:09:54,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:09:54,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:54,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:09:54,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=606173.3333333334, ans=0.125 2023-09-30 05:10:02,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:10:02,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:10:02,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:04,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=606173.3333333334, ans=0.125 2023-09-30 05:10:10,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:10:12,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:12,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 05:10:12,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:10:21,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 05:10:25,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:10:25,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:10:30,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 05:10:32,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:10:35,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 05:10:35,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:10:35,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:10:39,336 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=606306.6666666666, ans=0.04949747468305833 2023-09-30 05:10:42,005 INFO [train.py:1039] (0/4) Epoch 18, batch 650, loss[loss=0.1654, simple_loss=0.2266, pruned_loss=0.05208, over 23590.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2549, pruned_loss=0.05347, over 4539066.91 frames. 
], batch size: 256, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:10:43,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:10:45,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:10:47,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:10:48,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:10:51,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:10:54,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 05:10:55,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:56,136 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=606373.3333333334, ans=0.2 2023-09-30 05:11:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:11:00,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:03,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:10,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. 
Duration: 0.9409375 2023-09-30 05:11:12,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:12,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:15,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:15,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:11:18,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:18,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:18,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:11:20,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:22,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:11:24,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:11:25,658 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 05:11:25,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:11:25,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:28,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:28,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:29,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:29,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=606506.6666666666, ans=0.125 2023-09-30 05:11:30,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:11:30,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 05:11:32,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:11:32,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:11:33,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:11:33,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:36,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 05:11:36,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=606573.3333333334, ans=0.125 2023-09-30 05:11:38,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 05:11:40,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 05:11:40,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:40,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:40,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:11:40,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=606573.3333333334, ans=0.0 2023-09-30 05:11:41,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:43,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:49,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:50,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:50,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:11:54,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:54,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:11:54,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:54,991 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=606640.0, ans=0.125 2023-09-30 05:12:04,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:12:04,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:04,158 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:04,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:05,541 INFO [train.py:1039] (0/4) Epoch 18, batch 700, loss[loss=0.1696, simple_loss=0.2505, pruned_loss=0.04434, over 24602.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2537, pruned_loss=0.05288, over 4571886.49 frames. ], batch size: 60, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:12:10,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 05:12:11,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. 
Duration: 0.97 2023-09-30 05:12:14,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 05:12:16,480 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.793e+02 1.992e+02 2.237e+02 3.434e+02, threshold=3.985e+02, percent-clipped=0.0 2023-09-30 05:12:16,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:16,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:12:20,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 05:12:20,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=606773.3333333334, ans=0.09899494936611666 2023-09-30 05:12:25,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:26,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:12:30,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:30,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:12:32,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:12:35,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:35,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=606773.3333333334, ans=0.125 2023-09-30 05:12:37,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:12:37,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:12:38,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 05:12:41,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 05:12:46,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:12:46,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:12:47,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:12:48,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.31 vs. limit=22.5 2023-09-30 05:12:53,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:12:53,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. 
Duration: 0.91 2023-09-30 05:12:59,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:59,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:13:01,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 05:13:01,492 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=606906.6666666666, ans=0.125 2023-09-30 05:13:03,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:13:04,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:06,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=606906.6666666666, ans=0.2 2023-09-30 05:13:08,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:08,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=606906.6666666666, ans=0.2 2023-09-30 05:13:14,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:13:14,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 05:13:15,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. 
Duration: 0.89 2023-09-30 05:13:15,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 05:13:20,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:22,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:23,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:13:26,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:26,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 05:13:27,557 INFO [train.py:1039] (0/4) Epoch 18, batch 750, loss[loss=0.188, simple_loss=0.2654, pruned_loss=0.05529, over 23349.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2536, pruned_loss=0.05261, over 4595282.34 frames. ], batch size: 93, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:13:28,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=607040.0, ans=0.125 2023-09-30 05:13:30,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 05:13:30,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 05:13:32,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. 
Duration: 0.85 2023-09-30 05:13:32,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 05:13:33,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 05:13:34,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:13:35,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 05:13:36,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:37,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:13:39,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:42,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:42,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:13:42,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:45,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=607106.6666666666, ans=0.125 2023-09-30 05:13:46,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 05:13:48,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:13:49,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:13:52,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:52,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:54,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 05:13:55,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:13:55,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:57,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:14:01,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:14:01,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 05:14:02,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:04,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-09-30 05:14:04,721 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 05:14:04,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 05:14:04,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:14:06,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:14:07,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:14:13,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:14:13,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:14:18,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:14:19,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:19,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 05:14:19,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 05:14:20,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=607240.0, ans=0.125 2023-09-30 05:14:21,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 05:14:22,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:14:25,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:14:27,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 05:14:27,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:27,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=607240.0, ans=0.1 2023-09-30 05:14:32,962 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.min_positive, batch_count=607306.6666666666, ans=0.05 2023-09-30 05:14:34,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:14:34,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:14:35,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:14:37,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:14:39,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=607306.6666666666, ans=0.035 2023-09-30 05:14:42,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 05:14:42,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:42,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:49,883 INFO [train.py:1039] (0/4) Epoch 18, batch 800, loss[loss=0.168, simple_loss=0.2496, pruned_loss=0.04321, over 24664.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2536, pruned_loss=0.05256, over 4623217.41 frames. ], batch size: 65, lr: 5.79e-03, grad_scale: 32.0 2023-09-30 05:14:50,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:50,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:14:57,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:14:57,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:58,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=607373.3333333334, ans=0.07 2023-09-30 05:14:59,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:59,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:59,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:00,869 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.976e+02 2.273e+02 2.818e+02 4.949e+02, threshold=4.546e+02, percent-clipped=4.0 2023-09-30 05:15:01,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:06,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:06,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:15:10,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 05:15:11,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:15:13,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:15:13,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:15:13,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:14,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 05:15:14,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:14,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 05:15:17,861 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=607440.0, ans=0.125 2023-09-30 05:15:19,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:22,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:24,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:15:24,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:27,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:15:28,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:31,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:15:31,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:15:33,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 05:15:34,978 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 05:15:35,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 05:15:35,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:15:35,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:15:37,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:38,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:15:43,179 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 05:15:43,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. 
Duration: 0.4 2023-09-30 05:15:46,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:15:48,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:15:52,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:15:56,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:58,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 05:15:58,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:16:03,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 05:16:10,362 INFO [train.py:1039] (0/4) Epoch 18, batch 850, loss[loss=0.1868, simple_loss=0.2727, pruned_loss=0.05045, over 24317.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2555, pruned_loss=0.05308, over 4649261.58 frames. ], batch size: 74, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:16:10,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:13,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:16:13,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. 
Duration: 0.94 2023-09-30 05:16:13,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:16:15,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:17,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 05:16:17,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=607706.6666666666, ans=0.125 2023-09-30 05:16:18,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:18,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:16:20,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:21,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:16:22,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:16:24,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 05:16:24,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 05:16:24,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. 
Duration: 0.76 2023-09-30 05:16:27,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:27,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:16:29,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:29,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:30,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:16:36,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:36,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:16:36,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 05:16:40,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 05:16:43,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:45,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 05:16:50,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. 
Duration: 0.62 2023-09-30 05:16:50,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 05:16:50,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=607840.0, ans=0.125 2023-09-30 05:16:54,221 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 05:16:54,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:16:54,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:16:54,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:16:55,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:57,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:58,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 05:17:00,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:17:01,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:03,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 05:17:04,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:17:04,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=607906.6666666666, ans=0.125 2023-09-30 05:17:05,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:17:07,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:17:08,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 05:17:13,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:17:13,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:15,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:17:15,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:15,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:18,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:17:21,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:17:23,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:17:23,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:23,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:17:33,412 INFO [train.py:1039] (0/4) Epoch 18, batch 900, loss[loss=0.1945, simple_loss=0.2628, pruned_loss=0.06312, over 23432.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.256, pruned_loss=0.0532, over 4670817.50 frames. ], batch size: 134, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:17:33,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:17:34,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:36,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 05:17:36,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:36,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:38,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. 
Duration: 0.77 2023-09-30 05:17:42,756 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-09-30 05:17:46,197 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.851e+02 2.083e+02 2.531e+02 4.017e+02, threshold=4.166e+02, percent-clipped=0.0 2023-09-30 05:17:46,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:17:48,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:49,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 05:17:52,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:17:52,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 05:17:53,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:17:54,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:54,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:17:56,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 05:17:56,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:18:00,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=608106.6666666666, ans=0.125 2023-09-30 05:18:01,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=608106.6666666666, ans=0.125 2023-09-30 05:18:07,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:07,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:18:08,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:18:09,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=608173.3333333334, ans=0.0 2023-09-30 05:18:11,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:16,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 05:18:16,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:18:21,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. 
Duration: 0.536375 2023-09-30 05:18:23,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:18:24,615 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 05:18:26,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 05:18:27,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=608240.0, ans=0.125 2023-09-30 05:18:30,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:18:30,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:18:31,133 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=608240.0, ans=0.125 2023-09-30 05:18:33,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:18:39,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:39,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:18:40,761 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.19 vs. limit=15.0 2023-09-30 05:18:41,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-09-30 05:18:41,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:43,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 05:18:44,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=608306.6666666666, ans=0.125 2023-09-30 05:18:46,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:18:46,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:48,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:18:49,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:18:53,432 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.43 vs. limit=15.0 2023-09-30 05:18:54,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 05:18:54,388 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 05:18:56,280 INFO [train.py:1039] (0/4) Epoch 18, batch 950, loss[loss=0.1739, simple_loss=0.2545, pruned_loss=0.04659, over 24480.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2566, pruned_loss=0.05335, over 4680401.53 frames. 
], batch size: 63, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:18:58,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:18:58,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 05:18:59,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:02,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 05:19:08,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:11,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:11,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:12,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:19:15,074 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 05:19:20,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:20,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 05:19:20,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=608440.0, ans=0.05 2023-09-30 05:19:21,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:21,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:19:21,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 05:19:23,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:19:23,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=608440.0, ans=0.0 2023-09-30 05:19:25,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:26,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 05:19:26,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:33,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:33,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:33,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:19:34,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 05:19:36,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 05:19:38,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:38,416 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=608506.6666666666, ans=0.125 2023-09-30 05:19:39,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:19:44,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:19:44,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:48,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 05:19:51,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:19:51,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:19:51,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:19:51,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:19:51,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:19:55,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=608573.3333333334, ans=0.1 2023-09-30 05:19:56,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 05:19:58,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:20:01,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:01,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:01,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 05:20:01,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:01,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:20:03,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 05:20:08,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:20:08,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:20:15,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:15,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 05:20:15,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 05:20:19,775 INFO [train.py:1039] (0/4) Epoch 18, batch 1000, loss[loss=0.1763, simple_loss=0.2466, pruned_loss=0.05301, over 23529.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2549, pruned_loss=0.05302, over 4675319.04 frames. ], batch size: 149, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:20:19,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:25,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 05:20:25,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:20:30,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:20:33,039 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.036e+02 2.242e+02 3.072e+02 5.531e+02, threshold=4.484e+02, percent-clipped=11.0 2023-09-30 05:20:33,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 05:20:33,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. 
Duration: 0.93 2023-09-30 05:20:36,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:36,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:36,966 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.14 vs. limit=12.0 2023-09-30 05:20:38,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:43,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 05:20:46,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 05:20:47,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 05:20:48,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:20:50,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 05:20:51,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 05:20:51,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 05:20:53,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:20:53,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=608840.0, ans=0.125 2023-09-30 05:20:55,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:20:55,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=608840.0, ans=0.2 2023-09-30 05:21:03,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:03,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:21:05,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:05,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=608840.0, ans=10.0 2023-09-30 05:21:06,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:06,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 05:21:06,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:06,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:21:08,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:21:08,424 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 05:21:13,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 05:21:14,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 05:21:15,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 05:21:17,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:21:17,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=608906.6666666666, ans=0.5 2023-09-30 05:21:25,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:26,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:21:26,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:28,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:21:29,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 05:21:31,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 05:21:33,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 05:21:33,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 05:21:36,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:21:36,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:38,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:21:41,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:21:42,864 INFO [train.py:1039] (0/4) Epoch 18, batch 1050, loss[loss=0.1859, simple_loss=0.2656, pruned_loss=0.0531, over 24628.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2539, pruned_loss=0.05272, over 4683584.61 frames. ], batch size: 68, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:21:43,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:47,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:21:47,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:21:49,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 05:21:51,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:53,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:21:56,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:21:58,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:21:59,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=609106.6666666666, ans=0.2 2023-09-30 05:22:01,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:22:01,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:22:01,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:22:01,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=609106.6666666666, ans=0.0 2023-09-30 05:22:02,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:22:02,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 05:22:05,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 05:22:05,700 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.35 vs. limit=15.0 2023-09-30 05:22:06,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 05:22:09,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:22:09,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 05:22:09,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:22:17,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:22:17,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:22:17,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:18,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=609173.3333333334, ans=0.0 2023-09-30 05:22:20,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 05:22:21,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 05:22:21,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 05:22:26,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 05:22:29,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 05:22:31,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:22:34,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:22:36,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:22:36,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:22:37,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:22:38,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=609240.0, ans=0.125 2023-09-30 05:22:40,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:22:45,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 05:22:47,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 05:22:47,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. 
Duration: 0.55 2023-09-30 05:22:47,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:47,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=609240.0, ans=0.0 2023-09-30 05:22:47,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=609240.0, ans=0.125 2023-09-30 05:22:48,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:22:50,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 05:22:53,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:22:55,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:55,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:22:56,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:22:56,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. 
Duration: 0.98 2023-09-30 05:23:02,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:23:02,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 05:23:04,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 05:23:05,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:23:06,948 INFO [train.py:1039] (0/4) Epoch 18, batch 1100, loss[loss=0.1624, simple_loss=0.2455, pruned_loss=0.03965, over 24592.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2535, pruned_loss=0.05245, over 4693326.06 frames. ], batch size: 60, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:23:08,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:14,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:23:20,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:23:22,023 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.922e+02 2.115e+02 2.641e+02 4.048e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 05:23:22,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:23:22,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:23:22,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 05:23:25,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:23:26,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:23:28,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:23:31,303 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.29 vs. limit=15.0 2023-09-30 05:23:31,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:23:31,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 05:23:33,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:23:35,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:35,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:23:37,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 05:23:39,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.66 vs. limit=10.0 2023-09-30 05:23:40,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:23:42,502 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=609506.6666666666, ans=0.125 2023-09-30 05:23:45,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:23:47,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 05:23:49,233 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 05:23:49,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:23:54,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:54,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 05:23:56,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 05:23:56,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:23:56,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:23:56,625 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.10 vs. limit=6.0 2023-09-30 05:23:57,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:57,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 05:24:05,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:24:05,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 05:24:06,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:24:12,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:24:15,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 05:24:15,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:24:17,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:24:20,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:22,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:22,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 05:24:23,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:24:23,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:25,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 05:24:25,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:24:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 05:24:27,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:24:27,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:24:28,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:24:30,508 INFO [train.py:1039] (0/4) Epoch 18, batch 1150, loss[loss=0.1938, simple_loss=0.2606, pruned_loss=0.06353, over 23609.00 frames. 
], tot_loss[loss=0.1797, simple_loss=0.2541, pruned_loss=0.05271, over 4697438.27 frames. ], batch size: 256, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:24:33,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:35,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:24:38,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:38,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:24:38,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 05:24:38,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:24:42,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 05:24:43,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:43,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:24:49,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 05:24:52,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:24:56,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:56,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:24:56,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 05:24:57,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:24:57,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:25:02,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 05:25:02,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:04,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:25:15,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:19,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=609906.6666666666, ans=0.125 2023-09-30 05:25:23,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:23,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. 
Duration: 0.86 2023-09-30 05:25:23,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:24,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:30,979 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 05:25:33,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:36,226 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.99 vs. limit=22.5 2023-09-30 05:25:40,789 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 05:25:45,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:25:46,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:25:46,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:25:48,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:25:49,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:25:53,410 INFO [train.py:1039] (0/4) Epoch 18, batch 1200, loss[loss=0.1737, simple_loss=0.2624, pruned_loss=0.0425, over 24532.00 frames. 
], tot_loss[loss=0.1814, simple_loss=0.2559, pruned_loss=0.05346, over 4692639.37 frames. ], batch size: 71, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:25:57,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:25:57,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:25:58,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:58,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:00,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:26:00,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:26:03,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:26:06,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:06,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:08,275 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.757e+02 1.911e+02 2.190e+02 3.350e+02, threshold=3.822e+02, percent-clipped=0.0 2023-09-30 05:26:10,011 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. 
Duration: 0.896 2023-09-30 05:26:11,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 05:26:14,213 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.17 vs. limit=22.5 2023-09-30 05:26:18,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:26:19,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:26:22,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:24,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:26:24,453 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 05:26:26,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:27,028 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=610173.3333333334, ans=0.125 2023-09-30 05:26:33,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=610173.3333333334, ans=0.015 2023-09-30 05:26:34,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:26:34,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:26:34,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 05:26:36,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:26:39,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 05:26:44,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 05:26:44,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:46,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:47,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:26:47,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:26:48,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:49,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:26:51,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:26:51,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. 
Duration: 0.88 2023-09-30 05:26:51,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:26:52,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:26:52,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:26:54,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:55,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:27:00,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:27:02,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:27:06,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 05:27:11,354 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 05:27:14,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:15,616 INFO [train.py:1039] (0/4) Epoch 18, batch 1250, loss[loss=0.1692, simple_loss=0.2517, pruned_loss=0.04333, over 24317.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2575, pruned_loss=0.05467, over 4682166.15 frames. 
], batch size: 61, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:27:17,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:27:18,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:27:21,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:27:24,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 05:27:27,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:27:28,219 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=610373.3333333334, ans=0.125 2023-09-30 05:27:29,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:29,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 05:27:29,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=610373.3333333334, ans=0.09899494936611666 2023-09-30 05:27:31,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:27:34,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-30 05:27:37,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:27:37,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:38,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=610440.0, ans=0.0 2023-09-30 05:27:40,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:27:40,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:42,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:27:47,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:27:47,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:27:47,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:49,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:49,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:27:50,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:27:54,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:27:59,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 05:28:00,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:28:01,362 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=610506.6666666666, ans=0.1 2023-09-30 05:28:02,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:03,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 05:28:05,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:28:05,545 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 05:28:07,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:07,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:10,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:13,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:28:13,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:28:15,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 05:28:15,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 05:28:15,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 05:28:15,637 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=610573.3333333334, ans=0.125 2023-09-30 05:28:18,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:20,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=610640.0, ans=0.0 2023-09-30 05:28:21,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 05:28:21,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:21,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=610640.0, ans=0.125 2023-09-30 05:28:22,073 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=610640.0, ans=0.0 2023-09-30 05:28:23,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. 
Duration: 0.47275 2023-09-30 05:28:25,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:28:27,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 05:28:27,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:28:28,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:28:28,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:28:28,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:30,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 05:28:34,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:35,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:28:35,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:28:38,666 INFO [train.py:1039] (0/4) Epoch 18, batch 1300, loss[loss=0.1645, simple_loss=0.244, pruned_loss=0.04246, over 24630.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2578, pruned_loss=0.05458, over 4669469.72 frames. 
], batch size: 60, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:28:38,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:28:41,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:41,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 05:28:45,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:48,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:28:48,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:28:52,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:52,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:28:53,671 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.896e+02 2.139e+02 2.447e+02 3.795e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-30 05:28:53,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 05:28:59,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 05:28:59,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:28:59,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=610773.3333333334, ans=0.125 2023-09-30 05:29:00,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 05:29:05,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:29:10,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:10,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:12,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:29:13,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:15,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:29:16,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:29:17,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. 
Duration: 0.96 2023-09-30 05:29:19,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=610840.0, ans=10.0 2023-09-30 05:29:24,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:29:24,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:29:26,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 05:29:26,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:29:27,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:29:30,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:29:31,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 05:29:33,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:33,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 05:29:34,716 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.46 vs. limit=15.0 2023-09-30 05:29:35,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:29:40,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:40,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:29:43,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 05:29:45,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 05:29:45,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 05:29:49,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:29:53,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 05:29:54,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:56,899 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=610973.3333333334, ans=0.09899494936611666 2023-09-30 05:30:01,234 INFO [train.py:1039] (0/4) Epoch 18, batch 1350, loss[loss=0.189, simple_loss=0.2612, pruned_loss=0.05838, over 23367.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2565, pruned_loss=0.05416, over 4678184.09 frames. ], batch size: 93, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:30:01,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. 
Duration: 0.91 2023-09-30 05:30:07,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:08,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:14,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:30:14,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:16,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:30:18,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:21,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:23,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 05:30:24,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:30:24,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:30:27,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 05:30:29,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:30:31,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:30:31,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 05:30:32,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 05:30:33,504 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.69 vs. limit=6.0 2023-09-30 05:30:34,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 05:30:35,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:35,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 05:30:41,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=611173.3333333334, ans=0.1 2023-09-30 05:30:43,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=611173.3333333334, ans=0.0 2023-09-30 05:30:49,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:30:53,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=611240.0, ans=0.0 2023-09-30 05:30:59,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:59,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:30:59,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 05:31:02,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:03,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 05:31:04,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:31:04,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:31:07,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:31:09,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 05:31:09,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=611306.6666666666, ans=0.0 2023-09-30 05:31:11,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 05:31:13,567 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=611306.6666666666, ans=0.0 2023-09-30 05:31:17,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 05:31:18,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=611306.6666666666, ans=0.0 2023-09-30 05:31:21,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 05:31:24,496 INFO [train.py:1039] (0/4) Epoch 18, batch 1400, loss[loss=0.1609, simple_loss=0.2382, pruned_loss=0.04177, over 24584.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2558, pruned_loss=0.05404, over 4685885.98 frames. ], batch size: 60, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:31:26,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 05:31:26,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:29,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:31:31,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:31:35,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 05:31:37,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. 
Duration: 0.86 2023-09-30 05:31:38,680 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.909e+02 2.025e+02 2.334e+02 3.516e+02, threshold=4.051e+02, percent-clipped=0.0 2023-09-30 05:31:48,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:31:51,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:31:54,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:31:54,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:31:59,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:32:00,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:32:08,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:09,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:12,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 05:32:14,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:32:14,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 05:32:16,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:32:16,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:20,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:32:20,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:32:21,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:32:21,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 05:32:21,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:32:23,787 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=22.5 2023-09-30 05:32:27,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:31,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:32:39,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 05:32:40,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 05:32:40,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:32:43,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:32:44,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:45,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:32:46,986 INFO [train.py:1039] (0/4) Epoch 18, batch 1450, loss[loss=0.2007, simple_loss=0.2634, pruned_loss=0.06903, over 23707.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2549, pruned_loss=0.05316, over 4694458.85 frames. ], batch size: 164, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:32:48,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:32:52,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:32:52,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:52,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:32:58,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:58,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 05:32:59,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:59,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 05:33:00,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=611706.6666666666, ans=0.0 2023-09-30 05:33:01,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:33:03,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 05:33:03,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=611773.3333333334, ans=0.07 2023-09-30 05:33:04,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:05,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:05,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 05:33:06,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:07,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:33:08,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-30 05:33:08,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:10,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:33:12,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:14,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:15,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=611773.3333333334, ans=0.0 2023-09-30 05:33:18,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:33:18,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:33:19,257 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.79 vs. limit=15.0 2023-09-30 05:33:21,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:33:21,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 05:33:24,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:33:24,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:30,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 05:33:31,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:36,448 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 05:33:38,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:40,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:33:40,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:41,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 05:33:45,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=611906.6666666666, ans=0.0 2023-09-30 05:33:46,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:33:47,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 05:33:49,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 05:33:50,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:53,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:33:53,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:55,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 05:33:59,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 05:33:59,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 05:34:01,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:05,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:34:09,824 INFO [train.py:1039] (0/4) Epoch 18, batch 1500, loss[loss=0.1805, simple_loss=0.2648, pruned_loss=0.04812, over 24391.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2548, pruned_loss=0.05346, over 4692445.11 frames. 
], batch size: 69, lr: 5.77e-03, grad_scale: 8.0 2023-09-30 05:34:15,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 05:34:15,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:34:15,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:34:15,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:16,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:18,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:34:18,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 05:34:20,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:34:21,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:34:21,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:22,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 05:34:24,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:34:24,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:25,932 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.869e+02 2.047e+02 2.380e+02 3.622e+02, threshold=4.094e+02, percent-clipped=0.0 2023-09-30 05:34:32,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:32,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 05:34:34,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:34:34,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:34:35,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:41,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 05:34:46,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 05:34:46,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:47,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. 
Duration: 0.58 2023-09-30 05:34:48,562 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-09-30 05:34:49,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:34:50,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=612173.3333333334, ans=0.125 2023-09-30 05:34:50,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=612173.3333333334, ans=0.0 2023-09-30 05:34:51,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:34:53,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:53,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:34:54,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 05:34:54,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:34:55,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:34:55,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 05:34:56,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:34:57,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=612240.0, ans=0.1 2023-09-30 05:35:03,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:35:03,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 05:35:08,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:35:11,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:35:16,361 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 05:35:16,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:16,440 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 05:35:16,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=612306.6666666666, ans=0.0 2023-09-30 05:35:18,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:18,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:35:19,694 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. 
Duration: 0.896 2023-09-30 05:35:21,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:35:24,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 05:35:26,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:30,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:31,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:32,357 INFO [train.py:1039] (0/4) Epoch 18, batch 1550, loss[loss=0.171, simple_loss=0.2538, pruned_loss=0.04413, over 24490.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2554, pruned_loss=0.05345, over 4696971.70 frames. ], batch size: 66, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:35:32,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:35:32,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 05:35:34,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. 
Duration: 0.97 2023-09-30 05:35:34,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:35:35,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 05:35:35,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 05:35:35,883 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=612373.3333333334, ans=0.0 2023-09-30 05:35:38,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:40,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:40,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=612373.3333333334, ans=0.09899494936611666 2023-09-30 05:35:41,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:35:41,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:35:41,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=612373.3333333334, ans=0.1 2023-09-30 05:35:42,008 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=612373.3333333334, ans=0.125 2023-09-30 05:35:43,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 05:35:45,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:46,782 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 05:35:46,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:46,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:35:49,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:35:52,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:35:52,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 05:35:52,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:53,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 05:35:53,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 05:35:53,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 05:35:55,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:35:57,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:00,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:36:03,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 05:36:04,454 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.00 vs. limit=10.0 2023-09-30 05:36:04,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 05:36:12,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:14,668 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=612506.6666666666, ans=0.1 2023-09-30 05:36:15,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:36:15,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:36:16,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:36:17,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 05:36:21,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 05:36:24,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:27,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:36:30,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:36:30,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:30,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 05:36:30,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:33,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:36:33,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:35,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:36:35,134 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 05:36:36,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 05:36:41,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=612640.0, ans=0.125 2023-09-30 05:36:44,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 05:36:47,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:49,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:51,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 05:36:52,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:54,073 INFO [train.py:1039] (0/4) Epoch 18, batch 1600, loss[loss=0.2067, simple_loss=0.2715, pruned_loss=0.07101, over 23541.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2571, pruned_loss=0.05426, over 4700213.37 frames. ], batch size: 285, lr: 5.76e-03, grad_scale: 16.0 2023-09-30 05:36:54,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:56,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:36:56,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:36:57,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 05:36:59,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=612706.6666666666, ans=0.0 2023-09-30 05:37:01,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:01,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 05:37:03,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 05:37:05,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 05:37:08,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:09,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 05:37:11,285 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.776e+02 1.986e+02 2.218e+02 2.701e+02, threshold=3.972e+02, percent-clipped=0.0 2023-09-30 05:37:11,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:37:13,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:37:16,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:37:20,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-09-30 05:37:22,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:37:23,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 05:37:23,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:23,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=612773.3333333334, ans=0.0 2023-09-30 05:37:24,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 05:37:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 05:37:36,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:36,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 05:37:38,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:38,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:38,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 05:37:40,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=612840.0, ans=0.125 2023-09-30 05:37:41,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 05:37:41,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=612840.0, ans=0.04949747468305833 2023-09-30 05:37:46,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 05:37:47,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=612906.6666666666, ans=0.09899494936611666 2023-09-30 05:37:49,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:37:50,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:50,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:52,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:37:53,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:37:55,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 05:37:57,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:37:59,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=612973.3333333334, ans=0.125 2023-09-30 05:38:06,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:06,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:38:08,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=612973.3333333334, ans=0.125 2023-09-30 05:38:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 05:38:10,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:38:12,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 05:38:16,813 INFO [train.py:1039] (0/4) Epoch 18, batch 1650, loss[loss=0.1971, simple_loss=0.2699, pruned_loss=0.06218, over 23491.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.258, pruned_loss=0.05439, over 4701073.67 frames. ], batch size: 134, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:38:17,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:19,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 05:38:21,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:38:21,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 05:38:21,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 05:38:21,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 05:38:21,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 05:38:26,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:26,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:27,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:38:27,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:38:29,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:32,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 05:38:35,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:38:35,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:35,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:38:35,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:38:37,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 05:38:37,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 05:38:42,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=613106.6666666666, ans=0.125 2023-09-30 05:38:44,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:38:46,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:38:53,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=613173.3333333334, ans=0.1 2023-09-30 05:38:54,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 05:38:54,920 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=613173.3333333334, ans=0.0 2023-09-30 05:38:56,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:38:57,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 05:39:01,441 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.58 vs. limit=15.0 2023-09-30 05:39:02,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:03,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:39:06,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:39:06,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:06,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:39:06,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:07,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:09,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:09,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 05:39:09,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:11,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:11,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=613240.0, ans=0.125 2023-09-30 05:39:13,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:39:16,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 05:39:19,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:21,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 05:39:21,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 05:39:21,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 05:39:21,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:23,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 05:39:23,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:24,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:24,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 05:39:24,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=613306.6666666666, ans=0.1 2023-09-30 05:39:25,235 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.46 vs. limit=15.0 2023-09-30 05:39:27,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=613306.6666666666, ans=0.09899494936611666 2023-09-30 05:39:28,456 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-92000.pt 2023-09-30 05:39:31,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:33,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:39:33,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 05:39:36,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=613306.6666666666, ans=0.025 2023-09-30 05:39:37,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 05:39:41,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:41,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:39:42,836 INFO [train.py:1039] (0/4) Epoch 18, batch 1700, loss[loss=0.1823, simple_loss=0.2695, pruned_loss=0.04755, over 24629.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.257, pruned_loss=0.05363, over 4707235.00 frames. ], batch size: 68, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:39:42,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 05:39:43,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:39:43,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:39:43,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:44,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:39:46,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 05:39:46,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 05:39:50,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:40:00,300 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.875e+02 2.086e+02 2.365e+02 3.763e+02, threshold=4.171e+02, percent-clipped=0.0 2023-09-30 05:40:00,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:02,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:40:02,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=613440.0, ans=0.0 2023-09-30 05:40:06,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=613440.0, ans=0.125 2023-09-30 05:40:08,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:40:10,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:10,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:40:11,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:40:14,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 05:40:15,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:40:16,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:16,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:40:18,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:40:20,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 05:40:20,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 05:40:23,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:25,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 05:40:25,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=613506.6666666666, ans=0.1 2023-09-30 05:40:26,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:40:37,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 05:40:38,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:39,585 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.80 vs. limit=22.5 2023-09-30 05:40:40,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:41,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:40:41,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 05:40:41,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:43,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:43,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 05:40:44,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:40:44,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:44,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:40:44,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:40:46,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:46,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:40:48,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:49,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:40:49,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:54,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:56,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 05:40:57,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=613640.0, ans=0.1 2023-09-30 05:40:58,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:58,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:01,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. 
Duration: 0.57 2023-09-30 05:41:05,585 INFO [train.py:1039] (0/4) Epoch 18, batch 1750, loss[loss=0.1801, simple_loss=0.266, pruned_loss=0.04713, over 24473.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2559, pruned_loss=0.05302, over 4719272.89 frames. ], batch size: 69, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:41:08,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:11,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:11,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:41:11,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 05:41:13,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:41:16,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:41:16,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:21,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 05:41:23,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:24,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. 
Duration: 0.64 2023-09-30 05:41:24,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:26,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:41:28,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:41:31,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 05:41:32,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:41:34,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 05:41:41,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:41:45,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:41:45,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:49,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:49,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:52,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:41:54,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:56,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:41:58,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:59,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 05:41:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:42:04,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 05:42:04,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:05,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:07,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:42:11,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:42:11,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:42:13,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:42:15,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:18,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:21,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:23,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:42:23,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 05:42:23,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:24,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:42:24,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:24,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:42:24,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:42:27,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:42:28,512 INFO [train.py:1039] (0/4) Epoch 18, batch 1800, loss[loss=0.1635, simple_loss=0.2471, pruned_loss=0.03996, over 24674.00 frames. 
], tot_loss[loss=0.1804, simple_loss=0.2552, pruned_loss=0.05277, over 4716872.50 frames. ], batch size: 65, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:42:30,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:42:31,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:32,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=614040.0, ans=0.0 2023-09-30 05:42:33,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:42:36,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:39,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:42:41,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:42:45,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:42:46,621 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.840e+02 1.957e+02 2.222e+02 3.420e+02, threshold=3.915e+02, percent-clipped=0.0 2023-09-30 05:42:48,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:48,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:42:50,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:42:52,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:52,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 05:42:54,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:42:55,164 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=614106.6666666666, ans=0.125 2023-09-30 05:42:58,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:01,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 05:43:01,702 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=614173.3333333334, ans=0.0 2023-09-30 05:43:03,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 05:43:03,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 05:43:05,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:06,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:43:06,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:07,413 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.88 vs. limit=15.0 2023-09-30 05:43:08,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:43:10,620 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.01 vs. limit=15.0 2023-09-30 05:43:12,940 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 05:43:14,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:43:16,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:17,061 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=614240.0, ans=0.07 2023-09-30 05:43:18,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 05:43:19,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 05:43:21,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:43:22,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 05:43:23,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:43:30,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 05:43:34,031 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.83 vs. limit=22.5 2023-09-30 05:43:37,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:43:37,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 05:43:39,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:43:39,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:39,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:43:41,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 05:43:42,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=614306.6666666666, ans=0.1 2023-09-30 05:43:44,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:43:44,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:43:46,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 05:43:46,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:46,559 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=614306.6666666666, ans=0.0 2023-09-30 05:43:49,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:49,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:43:49,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:50,777 INFO [train.py:1039] (0/4) Epoch 18, batch 1850, loss[loss=0.17, simple_loss=0.2554, pruned_loss=0.0423, over 24526.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2552, pruned_loss=0.05268, over 4725632.65 frames. ], batch size: 66, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:43:52,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:52,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:43:54,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:55,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:43:58,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:43:58,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:06,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:44:07,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 05:44:09,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 05:44:14,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 05:44:17,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:17,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 05:44:17,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:44:27,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:44:30,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 05:44:33,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:44:33,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:44:35,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=614506.6666666666, ans=0.125 2023-09-30 05:44:38,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 05:44:38,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:39,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:44:39,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:44:41,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:44,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:44:46,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:44:48,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:48,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-30 05:44:48,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:49,774 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=614573.3333333334, ans=0.125 2023-09-30 05:44:49,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=614573.3333333334, ans=0.0 2023-09-30 05:44:51,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:52,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:44:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 05:44:55,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:58,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:44:58,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:44:58,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 05:44:58,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 05:45:02,870 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. 
Duration: 0.896 2023-09-30 05:45:04,263 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 05:45:04,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:45:04,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:45:06,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:06,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:06,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=614640.0, ans=0.0 2023-09-30 05:45:07,985 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 05:45:07,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:45:08,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:08,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=614640.0, ans=0.0 2023-09-30 05:45:09,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:45:09,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 05:45:11,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:45:11,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 05:45:14,123 INFO [train.py:1039] (0/4) Epoch 18, batch 1900, loss[loss=0.1772, simple_loss=0.2632, pruned_loss=0.04556, over 24676.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2555, pruned_loss=0.05269, over 4727683.32 frames. ], batch size: 68, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:45:14,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:14,319 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 05:45:14,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:45:15,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:19,859 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=614706.6666666666, ans=0.125 2023-09-30 05:45:21,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:24,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:45:24,226 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-09-30 05:45:24,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 05:45:25,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:27,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:45:27,331 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 05:45:27,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=614706.6666666666, ans=0.0 2023-09-30 05:45:28,842 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 05:45:29,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=614773.3333333334, ans=0.2 2023-09-30 05:45:30,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=614773.3333333334, ans=0.0 2023-09-30 05:45:31,887 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.898e+02 2.159e+02 2.520e+02 3.634e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-30 05:45:33,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 05:45:35,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:45:39,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. 
Duration: 0.65 2023-09-30 05:45:40,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 05:45:41,188 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.46 vs. limit=15.0 2023-09-30 05:45:48,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 05:45:52,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 05:45:54,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:54,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 05:45:54,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 05:45:54,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 05:45:54,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 05:45:54,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:45:58,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 05:46:01,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 05:46:07,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:07,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 05:46:07,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:46:09,777 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=614906.6666666666, ans=0.125 2023-09-30 05:46:11,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 05:46:11,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:19,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:46:19,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:46:19,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:46:19,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:46:20,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:46:22,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-09-30 05:46:22,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:46:26,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:26,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:46:29,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:46:29,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:29,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:29,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=614973.3333333334, ans=0.125 2023-09-30 05:46:30,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:33,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:36,672 INFO [train.py:1039] (0/4) Epoch 18, batch 1950, loss[loss=0.1821, simple_loss=0.2472, pruned_loss=0.05849, over 23705.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2558, pruned_loss=0.05277, over 4734332.96 frames. ], batch size: 164, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:46:36,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 05:46:38,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:38,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:46:40,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 05:46:40,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:46:40,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:45,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:46:45,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:46:45,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:49,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:46:50,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:50,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 05:46:50,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:46:52,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:54,135 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=615106.6666666666, ans=0.125 2023-09-30 05:46:56,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:00,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:47:00,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:00,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:47:00,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 05:47:01,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:47:01,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 05:47:02,139 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=615106.6666666666, ans=0.125 2023-09-30 05:47:03,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:08,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:09,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:47:10,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=615173.3333333334, ans=0.125 2023-09-30 05:47:16,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:47:19,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:47:20,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:47:21,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 05:47:22,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:47:23,748 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.31 vs. 
limit=6.0 2023-09-30 05:47:26,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:47:28,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:47:29,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:30,251 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=615240.0, ans=0.04949747468305833 2023-09-30 05:47:34,814 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=615240.0, ans=0.0 2023-09-30 05:47:39,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:39,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:40,387 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-09-30 05:47:40,415 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.07 vs. limit=12.0 2023-09-30 05:47:41,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:44,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:47:46,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:47:47,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:47,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 05:47:47,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:47:48,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:51,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 05:47:52,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:47:57,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:58,909 INFO [train.py:1039] (0/4) Epoch 18, batch 2000, loss[loss=0.1747, simple_loss=0.2546, pruned_loss=0.04737, over 24498.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2568, pruned_loss=0.05305, over 4728389.17 frames. ], batch size: 63, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:47:59,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:48:00,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:48:03,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:48:04,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:09,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 05:48:09,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:48:12,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:48:16,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 05:48:16,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:48:17,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:48:19,240 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.900e+02 2.106e+02 2.415e+02 3.499e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 05:48:20,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:48:22,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 05:48:24,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:48:27,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=615440.0, ans=0.2 2023-09-30 05:48:29,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 05:48:29,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:48:30,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 05:48:30,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:34,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:48:36,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:48:36,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:37,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:39,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 05:48:39,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 05:48:39,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=615506.6666666666, ans=0.2 2023-09-30 05:48:43,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 05:48:43,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:43,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:48:49,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:50,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=615573.3333333334, ans=0.0 2023-09-30 05:48:51,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:48:51,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:51,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:54,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:55,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:48:55,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:55,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:55,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:00,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:49:00,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 05:49:07,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:49:08,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:49:12,681 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:49:15,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:16,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:49:17,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:19,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:49:19,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:49:20,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:22,090 INFO [train.py:1039] (0/4) Epoch 18, batch 2050, loss[loss=0.1666, simple_loss=0.2562, pruned_loss=0.03852, over 24563.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2564, pruned_loss=0.05274, over 4738308.73 frames. ], batch size: 71, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:49:22,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:25,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:27,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:31,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:49:35,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:49:36,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:49:38,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:49:40,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 05:49:40,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:49:40,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:49:41,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:49:44,821 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.13 vs. limit=15.0 2023-09-30 05:49:51,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:49:52,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:53,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 05:49:55,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:58,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 05:49:58,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 05:50:03,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:04,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:05,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:50:06,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:06,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:50:07,317 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.38 vs. limit=12.0 2023-09-30 05:50:08,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:50:09,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:50:14,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:16,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:50:18,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:50:18,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:50:23,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:50:30,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 05:50:35,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:37,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:50:40,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:50:42,546 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.58 vs. limit=15.0 2023-09-30 05:50:42,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 05:50:44,565 INFO [train.py:1039] (0/4) Epoch 18, batch 2100, loss[loss=0.1692, simple_loss=0.2486, pruned_loss=0.04489, over 24310.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2552, pruned_loss=0.05273, over 4720405.01 frames. ], batch size: 61, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:50:46,748 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 05:50:46,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:50:48,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:48,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:50:49,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:49,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 05:50:51,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 05:50:53,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:55,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=616040.0, ans=0.125 2023-09-30 05:50:56,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:50:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:50:58,280 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=616040.0, ans=0.95 2023-09-30 05:50:59,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:59,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 05:51:00,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 05:51:01,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:51:01,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 05:51:01,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 05:51:03,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:03,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:03,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 05:51:04,695 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.913e+02 2.117e+02 2.530e+02 3.526e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 05:51:04,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 05:51:10,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 05:51:10,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:51:14,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 05:51:14,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:51:14,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=616106.6666666666, ans=0.0 2023-09-30 05:51:17,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:51:19,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 05:51:21,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:21,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:51:22,094 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=10.19 vs. limit=15.0 2023-09-30 05:51:22,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 05:51:22,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:22,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 05:51:24,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 05:51:24,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. 
Duration: 0.94 2023-09-30 05:51:28,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:51:29,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:51:31,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:34,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:36,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:36,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=616240.0, ans=0.125 2023-09-30 05:51:39,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 05:51:39,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:39,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:41,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. 
Duration: 0.76 2023-09-30 05:51:42,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 05:51:42,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 05:51:47,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:51:50,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:50,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 05:51:55,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:57,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:51:59,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:59,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:51:59,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:51:59,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:01,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:52:02,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:52:02,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:52:02,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:04,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 05:52:05,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 05:52:05,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:07,384 INFO [train.py:1039] (0/4) Epoch 18, batch 2150, loss[loss=0.1707, simple_loss=0.2477, pruned_loss=0.04684, over 20817.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.254, pruned_loss=0.05206, over 4717837.70 frames. ], batch size: 45, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:52:08,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:52:08,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:52:09,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:52:09,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 05:52:16,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:52:17,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:19,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:21,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:52:21,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:21,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:52:25,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:27,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:52:27,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:52:30,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:30,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 05:52:36,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:52:37,613 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:52:39,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:39,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:52:40,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:40,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:52:40,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:42,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 05:52:43,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:52:45,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:46,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 05:52:47,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:47,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:52:50,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:50,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:52:53,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:53,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 05:52:53,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:52:57,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:58,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:58,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:53:00,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:53:00,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:53:00,397 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616573.3333333334, ans=0.1 2023-09-30 05:53:01,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:01,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 05:53:05,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 05:53:05,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:53:05,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 05:53:05,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:06,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:53:07,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 05:53:07,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:53:07,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 05:53:07,398 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. 
Duration: 0.768 2023-09-30 05:53:07,398 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 05:53:07,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 05:53:07,632 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=616573.3333333334, ans=0.125 2023-09-30 05:53:07,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=616573.3333333334, ans=0.1 2023-09-30 05:53:10,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:10,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:53:10,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:53:12,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:13,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:53:15,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:15,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 05:53:18,155 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.36 vs. limit=22.5 2023-09-30 05:53:19,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=616640.0, ans=0.125 2023-09-30 05:53:24,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:53:24,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 05:53:29,814 INFO [train.py:1039] (0/4) Epoch 18, batch 2200, loss[loss=0.2296, simple_loss=0.2902, pruned_loss=0.08447, over 19494.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2544, pruned_loss=0.05255, over 4708200.42 frames. ], batch size: 388, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:53:29,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:53:31,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=616706.6666666666, ans=0.0 2023-09-30 05:53:32,528 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=22.5 2023-09-30 05:53:33,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:33,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:53:34,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 05:53:36,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:53:40,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:40,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:53:40,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 05:53:46,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 05:53:47,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:53:49,927 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.966e+02 2.292e+02 2.651e+02 4.144e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-30 05:53:56,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 05:53:59,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:00,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:02,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:54:03,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 05:54:05,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 05:54:09,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:54:11,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:11,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:54:15,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:54:15,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=616840.0, ans=0.04949747468305833 2023-09-30 05:54:18,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:21,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:54:22,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:23,126 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=616906.6666666666, ans=0.0 2023-09-30 05:54:24,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 05:54:25,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 05:54:28,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 05:54:30,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:30,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:54:30,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:30,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=616906.6666666666, ans=0.125 2023-09-30 05:54:32,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:33,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:33,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:33,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:35,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:54:35,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:54:36,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-30 05:54:39,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:54:39,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:54:40,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=616973.3333333334, ans=0.05 2023-09-30 05:54:42,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:54:44,425 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 05:54:44,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:54:46,759 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 05:54:46,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:54:46,978 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 05:54:50,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:50,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:54:52,043 INFO [train.py:1039] (0/4) Epoch 18, batch 2250, loss[loss=0.1899, simple_loss=0.2592, pruned_loss=0.06034, over 23336.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2549, pruned_loss=0.05286, over 4696857.61 frames. 
], batch size: 119, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:54:52,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:53,748 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 05:54:54,086 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=617040.0, ans=0.0 2023-09-30 05:54:55,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:54:57,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:03,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:55:05,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=617040.0, ans=0.0 2023-09-30 05:55:05,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=617040.0, ans=0.05 2023-09-30 05:55:06,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:55:09,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:10,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 05:55:10,386 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=617106.6666666666, ans=0.2 2023-09-30 05:55:11,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:14,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 05:55:14,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:14,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:55:17,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 05:55:19,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:55:19,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:20,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:26,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:26,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=617173.3333333334, ans=0.125 2023-09-30 05:55:27,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-30 05:55:29,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:55:30,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 05:55:30,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:34,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:55:39,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:41,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:42,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:55:42,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:44,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:45,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 05:55:47,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=617240.0, ans=0.125 2023-09-30 05:55:48,006 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.85 vs. limit=15.0 2023-09-30 05:55:50,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:55:53,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:55:57,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=617306.6666666666, ans=0.125 2023-09-30 05:55:59,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:55:59,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:55:59,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:55:59,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=617306.6666666666, ans=0.05 2023-09-30 05:56:04,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:56:07,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 05:56:07,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 05:56:07,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:09,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:56:12,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 05:56:14,354 INFO [train.py:1039] (0/4) Epoch 18, batch 2300, loss[loss=0.1712, simple_loss=0.2451, pruned_loss=0.04866, over 23652.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2556, pruned_loss=0.05286, over 4709257.21 frames. ], batch size: 149, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:56:14,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:56:14,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:19,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=617373.3333333334, ans=0.125 2023-09-30 05:56:20,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:56:23,514 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-09-30 05:56:26,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,608 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.928e+02 2.258e+02 2.853e+02 4.796e+02, threshold=4.517e+02, percent-clipped=2.0 2023-09-30 05:56:33,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:56:33,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:56:33,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:56:33,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 05:56:35,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:56:38,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:56:38,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:56:40,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 05:56:42,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:56:47,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:56:52,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:56:53,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:54,498 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.92 vs. limit=15.0 2023-09-30 05:56:55,443 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=617506.6666666666, ans=0.1 2023-09-30 05:56:56,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:57:00,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:03,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:57:05,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:57:05,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 05:57:05,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 05:57:10,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:57:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:10,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:10,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:57:12,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:12,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:57:12,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:57:13,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 05:57:13,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:57:13,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:13,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-09-30 05:57:22,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:57:27,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:57:29,381 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.07 vs. limit=12.0 2023-09-30 05:57:30,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:30,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:57:31,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:57:35,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:57:35,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:57:35,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:57:35,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 05:57:35,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=617706.6666666666, ans=0.125 2023-09-30 05:57:37,199 INFO [train.py:1039] (0/4) Epoch 18, batch 2350, loss[loss=0.1815, simple_loss=0.2579, pruned_loss=0.05255, over 23973.00 frames. 
], tot_loss[loss=0.1818, simple_loss=0.2567, pruned_loss=0.05346, over 4714429.76 frames. ], batch size: 86, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:57:42,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:57:42,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 05:57:43,076 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=617706.6666666666, ans=0.1 2023-09-30 05:57:47,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 05:57:49,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:52,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:57:54,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:56,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 05:57:57,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:58:01,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 05:58:02,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:58:05,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=617773.3333333334, ans=0.1 2023-09-30 05:58:06,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:58:06,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:58:11,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:58:13,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 05:58:14,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:58:14,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:58:14,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:16,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:58:19,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 05:58:22,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 05:58:22,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:58:25,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:58:26,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:58:31,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 05:58:31,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:58:34,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 05:58:34,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:58:40,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 05:58:43,044 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.18 vs. limit=12.0 2023-09-30 05:58:43,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 05:58:45,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 05:58:45,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:58:45,375 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 05:58:45,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 05:58:48,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 05:58:50,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:58:56,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:58:59,657 INFO [train.py:1039] (0/4) Epoch 18, batch 2400, loss[loss=0.1634, simple_loss=0.2384, pruned_loss=0.04425, over 24337.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2555, pruned_loss=0.05329, over 4713867.67 frames. ], batch size: 61, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 05:58:59,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:59:02,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:59:03,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-09-30 05:59:03,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=618040.0, ans=10.0 2023-09-30 05:59:04,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 05:59:11,116 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.59 vs. limit=10.0 2023-09-30 05:59:13,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:59:13,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:59:14,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 05:59:17,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:59:18,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:18,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 05:59:20,010 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.950e+02 2.202e+02 2.533e+02 3.814e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-30 05:59:25,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 05:59:26,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 05:59:31,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:59:34,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 05:59:38,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:59:42,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:46,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:59:46,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 05:59:46,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:59:52,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=618240.0, ans=0.1 2023-09-30 05:59:52,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=618240.0, ans=0.0 2023-09-30 05:59:58,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:01,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:00:03,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:03,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=618240.0, ans=0.2 2023-09-30 06:00:06,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:00:06,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:00:06,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:00:06,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:06,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:06,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:00:11,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:11,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:00:12,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 06:00:12,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. 
Duration: 0.96 2023-09-30 06:00:14,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:00:16,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 06:00:16,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 06:00:16,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 06:00:16,666 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 06:00:18,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 06:00:20,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:00:20,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:21,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:23,278 INFO [train.py:1039] (0/4) Epoch 18, batch 2450, loss[loss=0.1695, simple_loss=0.2518, pruned_loss=0.04359, over 24511.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2539, pruned_loss=0.05244, over 4705589.15 frames. ], batch size: 66, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:00:23,388 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-30 06:00:24,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:24,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:00:28,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:00:28,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:33,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:33,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:34,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=618373.3333333334, ans=0.0 2023-09-30 06:00:34,646 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.90 vs. limit=6.0 2023-09-30 06:00:35,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 06:00:36,222 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.04 vs. limit=15.0 2023-09-30 06:00:38,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:00:38,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:43,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:00:43,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:00:43,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:00:44,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 06:00:50,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:51,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:00:53,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:56,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:00:56,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 06:01:01,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 06:01:03,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:01:05,560 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=618506.6666666666, ans=0.0 2023-09-30 06:01:07,509 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:01:08,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:10,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:01:11,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:11,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:01:12,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:13,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:01:14,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 06:01:20,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 06:01:20,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:01:20,715 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.55 vs. limit=15.0 2023-09-30 06:01:24,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:01:24,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:31,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:01:31,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 06:01:32,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:01:32,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:01:32,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 06:01:34,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:01:34,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:01:39,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 06:01:41,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:43,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:01:46,480 INFO [train.py:1039] (0/4) Epoch 18, batch 2500, loss[loss=0.1928, simple_loss=0.2566, pruned_loss=0.06451, over 23756.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2529, pruned_loss=0.0524, over 4701402.99 frames. ], batch size: 164, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:01:46,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 06:01:48,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:01:54,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:03,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:02:04,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:02:04,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:04,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-09-30 06:02:06,275 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.833e+02 2.012e+02 2.316e+02 3.261e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 06:02:08,619 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=6.0 2023-09-30 06:02:13,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:02:13,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:14,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:02:14,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:02:14,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 06:02:18,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:18,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:19,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 06:02:19,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:02:20,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 06:02:20,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:24,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:02:26,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:27,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:02:29,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 06:02:31,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:02:32,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:38,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:41,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:41,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=618906.6666666666, ans=0.125 2023-09-30 06:02:45,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 06:02:49,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:02:53,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 06:02:53,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:53,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:02:55,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=618973.3333333334, ans=0.125 2023-09-30 06:02:56,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:02:56,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:02:56,597 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 06:02:56,598 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 06:02:56,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 06:03:01,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:04,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. 
Duration: 0.97 2023-09-30 06:03:04,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 06:03:04,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:03:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 06:03:05,257 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=618973.3333333334, ans=0.125 2023-09-30 06:03:09,325 INFO [train.py:1039] (0/4) Epoch 18, batch 2550, loss[loss=0.1915, simple_loss=0.2683, pruned_loss=0.05733, over 23950.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2539, pruned_loss=0.05278, over 4707443.53 frames. ], batch size: 86, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:03:09,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 06:03:11,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:12,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:03:13,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:03:16,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:16,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. 
Duration: 0.51 2023-09-30 06:03:17,541 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.03 vs. limit=10.0 2023-09-30 06:03:18,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:03:21,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 06:03:23,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:03:26,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:28,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:03:28,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 06:03:29,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:03:29,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:31,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:33,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:03:34,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. 
Duration: 0.94 2023-09-30 06:03:34,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:03:34,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:34,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 06:03:48,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:03:53,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:03:53,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:53,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:55,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:03:58,596 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=619240.0, ans=0.09899494936611666 2023-09-30 06:04:01,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:04:06,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:04:06,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 06:04:06,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:04:07,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:04:07,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:04:12,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:12,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:14,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=619306.6666666666, ans=0.0 2023-09-30 06:04:16,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=619306.6666666666, ans=0.0 2023-09-30 06:04:16,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=619306.6666666666, ans=0.0 2023-09-30 06:04:17,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:04:17,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 06:04:17,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:04:19,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 06:04:20,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:04:20,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:04:22,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:31,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:04:32,410 INFO [train.py:1039] (0/4) Epoch 18, batch 2600, loss[loss=0.1753, simple_loss=0.2442, pruned_loss=0.05323, over 23592.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2546, pruned_loss=0.05312, over 4709960.21 frames. ], batch size: 134, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:04:32,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:35,691 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 06:04:36,419 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.37 vs. limit=15.0 2023-09-30 06:04:38,639 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 06:04:38,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:04:40,690 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. 
Duration: 0.896 2023-09-30 06:04:40,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 06:04:40,854 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 06:04:44,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:44,776 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 06:04:46,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 06:04:47,859 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 06:04:49,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:04:51,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 06:04:52,580 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.862e+02 2.048e+02 2.291e+02 3.453e+02, threshold=4.097e+02, percent-clipped=0.0 2023-09-30 06:04:52,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 06:04:54,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:04:54,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 06:04:55,938 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. 
Duration: 0.896 2023-09-30 06:04:57,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 06:05:04,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:04,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:04,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:04,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 06:05:04,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=619506.6666666666, ans=0.125 2023-09-30 06:05:07,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:05:09,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=619506.6666666666, ans=0.125 2023-09-30 06:05:15,988 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 06:05:24,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:24,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:25,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. 
Duration: 0.88 2023-09-30 06:05:25,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:25,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:27,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 06:05:30,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:05:30,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:05:33,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,620 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 06:05:38,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:05:43,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:45,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:05:45,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. 
Duration: 0.87 2023-09-30 06:05:45,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:50,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:05:50,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:05:54,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 06:05:55,847 INFO [train.py:1039] (0/4) Epoch 18, batch 2650, loss[loss=0.1546, simple_loss=0.2398, pruned_loss=0.03472, over 24459.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2554, pruned_loss=0.05301, over 4723240.14 frames. ], batch size: 66, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:05:56,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:57,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:05:57,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=619706.6666666666, ans=0.125 2023-09-30 06:06:00,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 06:06:00,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:02,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 06:06:02,458 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 06:06:02,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:06,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:08,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:06:10,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:06:12,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:06:13,301 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.39 vs. limit=12.0 2023-09-30 06:06:13,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 06:06:13,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:06:13,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:06:18,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 06:06:20,306 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-09-30 06:06:23,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:26,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 06:06:28,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:28,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 06:06:33,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:33,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:06:33,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:34,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:39,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 06:06:39,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 06:06:41,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:06:44,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. 
Duration: 0.8 2023-09-30 06:06:44,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:46,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:46,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:06:47,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:48,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:51,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:53,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:06:54,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:54,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:06:56,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:06:57,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:59,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 06:06:59,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:01,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:07:02,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:07:05,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:06,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:07:06,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:06,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 06:07:10,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:07:11,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:14,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:15,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:16,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 06:07:16,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:17,530 INFO [train.py:1039] (0/4) Epoch 18, batch 2700, loss[loss=0.1948, simple_loss=0.2753, pruned_loss=0.05711, over 24392.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2561, pruned_loss=0.05329, over 4711180.55 frames. ], batch size: 77, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:07:18,208 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.85 vs. limit=15.0 2023-09-30 06:07:19,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:19,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 06:07:21,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:07:24,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:07:26,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:07:26,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 06:07:28,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:28,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:07:29,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:07:29,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 06:07:29,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:07:31,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:07:33,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:07:33,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=620106.6666666666, ans=0.2 2023-09-30 06:07:35,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:38,509 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.935e+02 2.170e+02 2.390e+02 3.266e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 06:07:38,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:07:40,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. 
Duration: 0.98 2023-09-30 06:07:40,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:07:40,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=620106.6666666666, ans=0.0 2023-09-30 06:07:42,624 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.32 vs. limit=15.0 2023-09-30 06:07:47,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:07:47,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:07:49,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=620173.3333333334, ans=0.05 2023-09-30 06:07:52,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:07:53,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:53,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:07:53,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:07:56,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:08:01,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:01,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:08:01,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:04,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:04,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:08:06,914 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=620240.0, ans=0.2 2023-09-30 06:08:14,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:08:16,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:08:18,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:08:18,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:18,397 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:08:21,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:08:22,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:22,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:23,583 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.28 vs. limit=15.0 2023-09-30 06:08:25,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:27,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:27,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:08:28,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=620306.6666666666, ans=0.125 2023-09-30 06:08:30,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:08:32,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:32,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:36,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. 
Duration: 0.42225 2023-09-30 06:08:36,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:39,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:08:39,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 06:08:41,307 INFO [train.py:1039] (0/4) Epoch 18, batch 2750, loss[loss=0.1663, simple_loss=0.2227, pruned_loss=0.05495, over 22758.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2558, pruned_loss=0.05375, over 4705074.50 frames. ], batch size: 322, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:08:41,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 06:08:41,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:45,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:08:45,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=620373.3333333334, ans=0.1 2023-09-30 06:08:46,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:49,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:49,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 06:08:49,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:52,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:08:54,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:08:55,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:08:55,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:55,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 06:08:55,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:55,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:09:01,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 06:09:02,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=620440.0, ans=0.0 2023-09-30 06:09:03,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:09:03,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:09:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:07,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:09:07,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:09:07,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:09:09,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:10,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:15,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:09:15,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:09:15,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:09:17,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:18,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 06:09:25,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:28,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:09:28,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=620573.3333333334, ans=0.0 2023-09-30 06:09:29,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:32,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:32,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:09:32,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:09:39,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:09:41,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:41,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 06:09:46,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:47,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. 
Duration: 0.67 2023-09-30 06:09:53,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:09:54,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:09:56,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 06:09:58,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:09:59,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:09:59,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 06:09:59,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:10:02,952 INFO [train.py:1039] (0/4) Epoch 18, batch 2800, loss[loss=0.1564, simple_loss=0.2335, pruned_loss=0.03968, over 22399.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2549, pruned_loss=0.05324, over 4715947.13 frames. ], batch size: 49, lr: 5.73e-03, grad_scale: 32.0 2023-09-30 06:10:03,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:10:03,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:03,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 06:10:04,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 06:10:04,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:06,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:06,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:07,100 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.05 vs. limit=15.0 2023-09-30 06:10:07,827 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 06:10:07,828 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 06:10:11,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:13,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:10:13,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:10:18,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:10:20,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. 
Duration: 0.79 2023-09-30 06:10:21,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:10:23,095 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.862e+02 2.010e+02 2.282e+02 3.813e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-30 06:10:23,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 06:10:24,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:24,953 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:10:24,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:29,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:30,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:30,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:10:31,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:10:40,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:10:42,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:10:45,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:47,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:10:47,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:54,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:10:54,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 06:10:55,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:56,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:56,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:10:59,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=620906.6666666666, ans=0.125 2023-09-30 06:11:01,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:01,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:11:03,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=620906.6666666666, ans=0.2 2023-09-30 06:11:04,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:11:06,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:11:06,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:06,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:11:07,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:11:09,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:11:10,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:11:10,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 06:11:10,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:11,091 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=620973.3333333334, ans=0.125 2023-09-30 06:11:12,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 06:11:12,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:15,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 06:11:15,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:15,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:11:17,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:11:19,439 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.04 vs. limit=10.0 2023-09-30 06:11:20,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 06:11:20,766 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=12.0 2023-09-30 06:11:25,215 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.09 vs. limit=15.0 2023-09-30 06:11:25,721 INFO [train.py:1039] (0/4) Epoch 18, batch 2850, loss[loss=0.166, simple_loss=0.2463, pruned_loss=0.04287, over 24673.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2536, pruned_loss=0.05263, over 4719918.36 frames. ], batch size: 65, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:11:27,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 06:11:27,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:11:28,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:11:30,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:34,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:11:34,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:11:35,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:39,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:40,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:42,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:11:42,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. 
Duration: 0.5 2023-09-30 06:11:42,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=621106.6666666666, ans=22.5 2023-09-30 06:11:48,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 06:11:48,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:50,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 06:11:50,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:52,371 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.37 vs. limit=15.0 2023-09-30 06:11:55,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 06:11:55,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 06:11:56,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:02,671 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=621173.3333333334, ans=0.125 2023-09-30 06:12:11,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:12,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 06:12:12,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:12:14,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:12:14,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:12:14,417 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:12:17,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:12:17,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 06:12:19,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:12:19,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:19,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:19,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:22,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:22,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:12:23,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:25,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:27,214 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=621240.0, ans=0.125 2023-09-30 06:12:28,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:12:28,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:28,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:28,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_abs, batch_count=621240.0, ans=0.5 2023-09-30 06:12:31,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:12:37,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:12:40,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 06:12:40,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 06:12:42,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 06:12:43,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:43,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 06:12:44,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:12:44,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:46,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:46,134 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:12:46,135 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 06:12:47,641 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 06:12:47,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:12:47,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:49,180 INFO [train.py:1039] (0/4) Epoch 18, batch 2900, loss[loss=0.1607, simple_loss=0.2351, pruned_loss=0.04317, over 24422.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2539, pruned_loss=0.05244, over 4724194.53 frames. 
], batch size: 58, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:12:52,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:12:53,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:53,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:55,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 06:12:58,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:58,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 06:13:00,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 06:13:01,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:13:01,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:13:03,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:05,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:13:08,436 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.787e+02 2.109e+02 2.458e+02 3.664e+02, threshold=4.218e+02, percent-clipped=0.0 2023-09-30 06:13:10,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:13:10,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:13:12,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=621440.0, ans=0.125 2023-09-30 06:13:13,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:13:13,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 06:13:13,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:13:16,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:17,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 06:13:19,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 06:13:22,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:13:22,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. 
Duration: 0.48 2023-09-30 06:13:22,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:13:25,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:13:25,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:13:27,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=621506.6666666666, ans=0.1 2023-09-30 06:13:28,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:28,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:33,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:13:36,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:13:37,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 06:13:37,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 06:13:37,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:13:43,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 06:13:48,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 06:13:50,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:13:54,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:14:02,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:14:02,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:14:03,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 06:14:08,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:08,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 06:14:08,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:08,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:14:10,967 INFO [train.py:1039] (0/4) Epoch 18, batch 2950, loss[loss=0.1893, simple_loss=0.2614, pruned_loss=0.05858, over 23856.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2548, pruned_loss=0.05279, over 4711494.64 frames. 
], batch size: 179, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:14:14,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:16,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 06:14:18,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:20,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:20,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:14:22,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:14:23,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 06:14:24,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 06:14:25,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:14:25,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:31,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:33,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 06:14:36,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:14:38,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:41,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:14:41,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:14:44,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:14:45,070 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=621840.0, ans=15.0 2023-09-30 06:14:46,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 06:14:53,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 06:14:53,329 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 06:14:54,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 06:14:56,271 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 06:14:58,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 06:14:58,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:58,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:58,546 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 06:14:58,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:15:03,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 06:15:03,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:15:03,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:15:06,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:08,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:15:08,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:15:08,567 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 06:15:09,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:10,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 06:15:14,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:16,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:15:16,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 06:15:16,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:15:19,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 06:15:22,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:23,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=621973.3333333334, ans=0.0 2023-09-30 06:15:25,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:15:25,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 06:15:28,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:28,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:15:29,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:15:29,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=621973.3333333334, ans=0.125 2023-09-30 06:15:31,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:31,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:15:31,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:15:33,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:34,752 INFO [train.py:1039] (0/4) Epoch 18, batch 3000, loss[loss=0.1919, simple_loss=0.2726, pruned_loss=0.05564, over 23985.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2556, pruned_loss=0.0528, over 4724799.83 frames. ], batch size: 86, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:15:34,753 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 06:15:49,338 INFO [train.py:1071] (0/4) Epoch 18, validation: loss=0.3403, simple_loss=0.2856, pruned_loss=0.1975, over 1125622.00 frames. 
2023-09-30 06:15:49,339 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 06:15:49,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:15:51,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:51,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 06:15:52,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:56,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:15:56,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:16:01,463 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 06:16:01,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 06:16:03,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:16:03,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:16:03,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. 
Duration: 0.48 2023-09-30 06:16:04,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:09,646 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.852e+02 2.145e+02 2.482e+02 3.954e+02, threshold=4.290e+02, percent-clipped=0.0 2023-09-30 06:16:12,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:16:22,104 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-09-30 06:16:22,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:16:28,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 06:16:31,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:16:34,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:16:36,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:36,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:16:37,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 06:16:37,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 06:16:37,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 06:16:41,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:16:41,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:16:44,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:16:44,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:44,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:16:44,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:16:46,839 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.78 vs. limit=6.0 2023-09-30 06:16:49,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:16:49,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:49,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 06:16:50,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:53,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 06:16:54,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:16:54,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:16:56,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:16:56,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=622306.6666666666, ans=0.1 2023-09-30 06:16:59,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:00,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:02,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:17:02,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 06:17:02,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:02,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. 
Duration: 0.99 2023-09-30 06:17:03,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=622306.6666666666, ans=0.125 2023-09-30 06:17:04,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:17:07,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 06:17:08,226 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=622306.6666666666, ans=0.2 2023-09-30 06:17:10,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:12,321 INFO [train.py:1039] (0/4) Epoch 18, batch 3050, loss[loss=0.1758, simple_loss=0.2487, pruned_loss=0.05146, over 23729.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2557, pruned_loss=0.05255, over 4729795.37 frames. ], batch size: 135, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:17:12,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:17:12,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 06:17:13,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 06:17:13,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:17:15,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:17:15,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:15,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:17:16,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:16,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:17:20,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 06:17:22,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:17:25,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:25,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:17:30,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:33,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 06:17:40,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 06:17:40,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. 
Duration: 0.94 2023-09-30 06:17:41,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:17:43,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:17:44,277 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=622506.6666666666, ans=0.1 2023-09-30 06:17:47,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:48,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:48,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:53,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:17:53,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:53,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:54,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:54,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 06:17:54,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=622506.6666666666, ans=0.125 2023-09-30 06:17:56,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:58,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:01,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:02,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 06:18:02,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:18:03,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:18:05,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:18:05,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:18:05,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:06,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 06:18:10,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:18:10,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:12,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=622573.3333333334, ans=0.025 2023-09-30 06:18:17,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:17,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:18:17,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:22,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:22,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:18:22,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:18:24,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 06:18:25,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 06:18:26,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:27,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 06:18:29,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:34,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:35,973 INFO [train.py:1039] (0/4) Epoch 18, batch 3100, loss[loss=0.1785, simple_loss=0.2604, pruned_loss=0.04828, over 24450.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2556, pruned_loss=0.05205, over 4736214.95 frames. ], batch size: 77, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:18:37,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:18:39,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:18:40,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 06:18:42,760 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.00 vs. limit=10.0 2023-09-30 06:18:45,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 06:18:47,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. 
Duration: 0.83 2023-09-30 06:18:49,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:18:52,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:52,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:55,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:18:57,049 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.806e+02 2.048e+02 2.293e+02 3.321e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 06:18:57,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=622773.3333333334, ans=0.125 2023-09-30 06:18:58,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:04,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 06:19:08,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:19:08,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:08,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:19:09,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:19:10,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:19:13,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:19:13,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 06:19:13,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:19:13,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:17,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 06:19:18,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:19:22,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:19:22,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=622840.0, ans=0.0 2023-09-30 06:19:23,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 06:19:24,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. 
Duration: 0.54 2023-09-30 06:19:25,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:25,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:28,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:28,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:28,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:19:30,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:19:30,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:19:33,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:19:33,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:19:33,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:33,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:19:39,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:19:40,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 06:19:40,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=622973.3333333334, ans=0.2 2023-09-30 06:19:43,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:19:43,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 06:19:45,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:45,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:45,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 06:19:45,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=622973.3333333334, ans=0.2 2023-09-30 06:19:57,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 06:19:58,739 INFO [train.py:1039] (0/4) Epoch 18, batch 3150, loss[loss=0.2009, simple_loss=0.2634, pruned_loss=0.06923, over 23933.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2553, pruned_loss=0.05198, over 4736906.27 frames. ], batch size: 195, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:20:00,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:20:00,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:03,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:20:03,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:20:04,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 06:20:05,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:05,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:20:05,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=623040.0, ans=0.2 2023-09-30 06:20:06,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 06:20:08,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:09,879 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 06:20:14,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 06:20:14,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 06:20:15,245 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=623106.6666666666, ans=0.125 2023-09-30 06:20:16,373 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 06:20:18,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:20:18,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 06:20:20,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 06:20:20,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 06:20:20,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:20,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:21,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:23,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 06:20:27,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:20:27,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:30,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:20:33,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 06:20:33,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:20:36,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:20:38,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:38,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 06:20:41,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 06:20:43,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:20:43,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:20:43,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:20:44,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:20:44,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:20:44,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:20:46,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:20:47,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 06:20:49,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:20:49,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:52,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:20:52,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:52,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 06:20:54,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:20:56,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 06:20:56,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 06:20:58,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 06:20:58,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 06:21:01,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:21:01,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:04,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 06:21:04,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 06:21:05,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:21:08,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:21:10,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:10,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:21:12,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=12.0 2023-09-30 06:21:15,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 06:21:16,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:18,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 06:21:21,427 INFO [train.py:1039] (0/4) Epoch 18, batch 3200, loss[loss=0.1581, simple_loss=0.2297, pruned_loss=0.04331, over 20306.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2541, pruned_loss=0.05175, over 4738495.15 frames. ], batch size: 43, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:21:23,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:21:23,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:21:25,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=623373.3333333334, ans=0.0 2023-09-30 06:21:28,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:30,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:21:30,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 06:21:34,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:39,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 06:21:42,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:42,544 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=623440.0, ans=0.0 2023-09-30 06:21:43,650 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.833e+02 1.996e+02 2.319e+02 3.127e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-30 06:21:50,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:21:59,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 06:22:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:22:03,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=623506.6666666666, ans=0.2 2023-09-30 06:22:04,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 06:22:04,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:22:08,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:22:08,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:22:10,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 06:22:15,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 06:22:17,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:22:20,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 06:22:21,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 06:22:24,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:22:32,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:22:32,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,916 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 06:22:32,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:22:37,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:22:39,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. 
Duration: 0.64 2023-09-30 06:22:41,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 06:22:41,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 06:22:43,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 06:22:44,459 INFO [train.py:1039] (0/4) Epoch 18, batch 3250, loss[loss=0.1804, simple_loss=0.2518, pruned_loss=0.05449, over 23505.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2549, pruned_loss=0.05213, over 4740645.28 frames. ], batch size: 134, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:22:44,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:22:48,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:22:48,480 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 06:22:48,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:22:48,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:22:50,017 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 06:22:53,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 06:22:56,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:23:05,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:05,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 06:23:07,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:07,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:23:07,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:08,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:23:09,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=623773.3333333334, ans=0.2 2023-09-30 06:23:13,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:23:13,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 06:23:13,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:14,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:23:17,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:19,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:21,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:21,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:22,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:24,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:24,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:28,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 06:23:30,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:23:30,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:23:30,875 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=623840.0, ans=0.125 2023-09-30 06:23:32,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:32,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:23:37,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:23:37,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=623906.6666666666, ans=0.125 2023-09-30 06:23:39,686 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=623906.6666666666, ans=0.1 2023-09-30 06:23:46,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:23:46,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:46,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 06:23:46,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 06:23:46,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:23:46,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:50,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 06:23:51,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 06:23:51,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:53,144 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.42 vs. limit=15.0 2023-09-30 06:23:54,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:56,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:56,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:23:57,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:24:00,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:00,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:24:01,515 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.95 vs. limit=15.0 2023-09-30 06:24:02,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 06:24:03,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:04,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=623973.3333333334, ans=0.0 2023-09-30 06:24:06,741 INFO [train.py:1039] (0/4) Epoch 18, batch 3300, loss[loss=0.1901, simple_loss=0.2599, pruned_loss=0.06009, over 23762.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2549, pruned_loss=0.05224, over 4736054.05 frames. ], batch size: 232, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:24:06,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:24:06,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 06:24:09,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:24:09,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 06:24:12,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 06:24:12,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. 
Duration: 0.96 2023-09-30 06:24:13,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:18,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:19,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:24:19,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:22,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:24:22,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:24:26,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:26,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:30,035 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.896e+02 2.095e+02 2.468e+02 4.456e+02, threshold=4.189e+02, percent-clipped=2.0 2023-09-30 06:24:33,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 06:24:33,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 06:24:33,810 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:35,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:36,790 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 06:24:36,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:24:36,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:24:37,246 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=624106.6666666666, ans=0.2 2023-09-30 06:24:38,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:24:38,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:24:38,537 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 06:24:43,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:43,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:24:44,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:24:44,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 06:24:46,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 06:24:46,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:47,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:24:49,379 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 06:24:51,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 06:24:51,828 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.89 vs. limit=22.5 2023-09-30 06:24:52,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:24:54,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 06:24:56,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:24:59,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:24:59,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 06:24:59,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=624240.0, ans=0.125 2023-09-30 06:25:03,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:03,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:03,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:25:03,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:25:07,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:25:07,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:07,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:25:08,803 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 06:25:08,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 06:25:12,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:25:12,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 06:25:12,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:15,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:15,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:16,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:25:16,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:16,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:25:18,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:18,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=624306.6666666666, ans=0.125 2023-09-30 06:25:19,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:25:21,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 06:25:22,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 06:25:22,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:23,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:25:25,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:25:26,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:28,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:28,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:29,263 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=624373.3333333334, ans=0.125 2023-09-30 06:25:30,144 INFO [train.py:1039] (0/4) Epoch 18, batch 3350, loss[loss=0.2084, simple_loss=0.2673, pruned_loss=0.07472, over 22736.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2557, pruned_loss=0.053, over 4729174.22 frames. ], batch size: 322, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:25:33,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:35,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:35,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 06:25:39,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:40,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:25:44,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:44,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:25:45,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 06:25:47,216 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 06:25:47,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:49,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=624440.0, ans=0.125 2023-09-30 06:25:51,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 06:25:51,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 06:25:53,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:25:53,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 06:25:54,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:55,467 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.88 vs. limit=15.0 2023-09-30 06:25:56,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 06:25:56,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:56,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:25:59,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:01,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:02,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:03,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:26:03,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=624506.6666666666, ans=0.2 2023-09-30 06:26:04,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=624506.6666666666, ans=0.0 2023-09-30 06:26:06,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:09,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:09,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:13,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:26:15,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:17,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:18,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:21,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:23,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. 
Duration: 0.83 2023-09-30 06:26:24,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:26:24,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 06:26:24,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:26:27,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 06:26:27,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:29,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:31,791 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.73 vs. limit=15.0 2023-09-30 06:26:32,642 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=624573.3333333334, ans=0.1 2023-09-30 06:26:32,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=624573.3333333334, ans=0.125 2023-09-30 06:26:37,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:39,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 06:26:39,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 06:26:40,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:26:42,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:26:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:26:51,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 06:26:51,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:26:51,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:26:53,298 INFO [train.py:1039] (0/4) Epoch 18, batch 3400, loss[loss=0.1955, simple_loss=0.2625, pruned_loss=0.06421, over 22640.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2571, pruned_loss=0.05375, over 4728469.96 frames. ], batch size: 322, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:26:53,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:53,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 06:26:54,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:54,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-09-30 06:26:56,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:26:58,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:26:58,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 06:27:02,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 06:27:02,804 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 06:27:02,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:07,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:27:07,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:27:09,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:10,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 06:27:16,036 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.809e+02 2.054e+02 2.312e+02 3.383e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 06:27:16,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:17,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 06:27:22,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:27:24,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:24,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:27,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:27:35,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:27:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 06:27:45,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:47,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:48,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. 
Duration: 0.92 2023-09-30 06:27:48,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:27:50,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:50,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:50,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:27:53,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:58,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:28:00,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:28:04,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:06,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 06:28:11,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=624973.3333333334, ans=0.2 2023-09-30 06:28:11,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=624973.3333333334, ans=0.04949747468305833 2023-09-30 06:28:13,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-30 06:28:14,629 INFO [train.py:1039] (0/4) Epoch 18, batch 3450, loss[loss=0.1774, simple_loss=0.2607, pruned_loss=0.04707, over 24650.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2564, pruned_loss=0.05361, over 4712321.46 frames. ], batch size: 68, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:28:16,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 06:28:21,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 06:28:21,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:28:23,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:28:23,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 06:28:23,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:27,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:28:32,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:28:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:34,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 06:28:34,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:37,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:42,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 06:28:48,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 06:28:48,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:28:48,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:28:52,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:57,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 06:28:59,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:29:00,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=625173.3333333334, ans=0.125 2023-09-30 06:29:04,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:05,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 06:29:07,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:29:08,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:29:12,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 06:29:12,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:12,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:29:16,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:29:18,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 06:29:23,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:29:28,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:29:28,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:30,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=625306.6666666666, ans=0.0 2023-09-30 06:29:33,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 06:29:37,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:37,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:38,606 INFO [train.py:1039] (0/4) Epoch 18, batch 3500, loss[loss=0.173, simple_loss=0.2538, pruned_loss=0.04606, over 24540.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2551, pruned_loss=0.05282, over 4719686.15 frames. ], batch size: 71, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:29:38,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:29:40,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:41,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=625373.3333333334, ans=0.0 2023-09-30 06:29:45,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:47,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:29:49,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 06:29:50,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:29:53,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. 
Duration: 0.588875 2023-09-30 06:29:55,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:55,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 06:30:01,128 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.444e+02 1.872e+02 2.058e+02 2.368e+02 3.255e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 06:30:01,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:30:01,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:30:03,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:30:03,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:03,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:30:03,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:05,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:05,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 06:30:09,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:30:10,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:30:12,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:14,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:16,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 06:30:16,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:16,253 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=625506.6666666666, ans=0.1 2023-09-30 06:30:17,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:20,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:30:22,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:24,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:30:24,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:25,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. 
Duration: 0.75 2023-09-30 06:30:27,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 06:30:27,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 06:30:29,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:30,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:30,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:30,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:30:33,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:30:34,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:30:38,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=625573.3333333334, ans=0.05 2023-09-30 06:30:41,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:30:42,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 06:30:42,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. 
Duration: 0.88 2023-09-30 06:30:42,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:30:46,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:30:46,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:46,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:49,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 06:30:50,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:53,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:53,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 06:30:56,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 06:30:59,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:59,667 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=625640.0, ans=0.0 2023-09-30 06:31:00,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 06:31:01,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:01,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:02,379 INFO [train.py:1039] (0/4) Epoch 18, batch 3550, loss[loss=0.1725, simple_loss=0.2581, pruned_loss=0.04344, over 24037.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2532, pruned_loss=0.05254, over 4708241.71 frames. ], batch size: 80, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:31:04,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:31:12,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:15,509 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.74 vs. limit=15.0 2023-09-30 06:31:16,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:31:17,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:18,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=625773.3333333334, ans=0.0 2023-09-30 06:31:19,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:31:20,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:31:22,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:31:22,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:31:27,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:27,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:31:27,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:29,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:31:29,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:31:36,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:31:36,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:38,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:38,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:38,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 06:31:38,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 06:31:38,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:40,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:40,443 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=625840.0, ans=0.0 2023-09-30 06:31:41,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:31:48,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:50,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:50,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:52,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 06:31:54,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:31:55,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 06:31:57,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 06:31:58,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:31:58,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:32:02,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 06:32:04,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:07,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:09,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 06:32:09,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:15,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:32:15,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=625973.3333333334, ans=0.125 2023-09-30 06:32:15,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.02 vs. limit=6.0 2023-09-30 06:32:16,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. 
Duration: 0.97 2023-09-30 06:32:16,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=625973.3333333334, ans=0.1 2023-09-30 06:32:23,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 06:32:23,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:32:24,963 INFO [train.py:1039] (0/4) Epoch 18, batch 3600, loss[loss=0.1838, simple_loss=0.27, pruned_loss=0.04881, over 24350.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2536, pruned_loss=0.05227, over 4710472.07 frames. ], batch size: 74, lr: 5.70e-03, grad_scale: 32.0 2023-09-30 06:32:25,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:32:27,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:28,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:30,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:32:35,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:35,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:37,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 06:32:37,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:32:38,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:38,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 06:32:42,218 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:32:43,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:32:43,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:46,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:47,985 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.865e+02 1.967e+02 2.260e+02 3.686e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 06:32:49,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:32:51,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:32:52,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:52,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. 
Duration: 0.69 2023-09-30 06:32:54,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:56,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:57,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=626173.3333333334, ans=0.125 2023-09-30 06:32:58,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:32:59,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:01,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:33:03,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:03,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 06:33:13,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:14,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:33:16,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. 
Duration: 0.95 2023-09-30 06:33:19,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=626240.0, ans=0.05 2023-09-30 06:33:21,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:33:23,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=626240.0, ans=0.0 2023-09-30 06:33:25,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:27,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:34,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:33:34,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:33:34,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 06:33:36,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 06:33:38,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 06:33:41,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:41,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 06:33:41,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=626306.6666666666, ans=0.125 2023-09-30 06:33:42,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 06:33:44,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:33:44,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:33:44,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:44,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=626306.6666666666, ans=0.125 2023-09-30 06:33:45,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 06:33:47,874 INFO [train.py:1039] (0/4) Epoch 18, batch 3650, loss[loss=0.2511, simple_loss=0.2991, pruned_loss=0.1015, over 18993.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2554, pruned_loss=0.05324, over 4713027.04 frames. ], batch size: 388, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:33:47,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 06:33:48,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=626373.3333333334, ans=0.125 2023-09-30 06:33:49,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 06:33:51,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 06:33:53,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=626373.3333333334, ans=0.125 2023-09-30 06:33:55,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 06:33:57,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:34:00,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 06:34:00,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=626373.3333333334, ans=0.0 2023-09-30 06:34:02,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 06:34:07,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:07,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:34:08,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:34:11,971 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=626440.0, ans=0.125 2023-09-30 06:34:13,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. 
Duration: 0.77775 2023-09-30 06:34:13,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:34:14,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 06:34:14,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:34:14,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:14,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 06:34:17,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:34:19,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:34:19,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:20,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:34:24,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 06:34:24,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 06:34:25,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:34:28,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 06:34:28,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:28,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:34:36,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:34:37,316 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.36 vs. limit=15.0 2023-09-30 06:34:38,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:38,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:34:40,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:34:40,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:34:41,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 06:34:42,221 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=626573.3333333334, ans=0.125 2023-09-30 06:34:45,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:45,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:45,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:49,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:34:50,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:50,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:34:59,204 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 06:35:03,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:03,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:03,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:35:05,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:35:05,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:35:06,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:08,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 06:35:08,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:09,607 INFO [train.py:1039] (0/4) Epoch 18, batch 3700, loss[loss=0.1769, simple_loss=0.249, pruned_loss=0.05241, over 24448.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2556, pruned_loss=0.05315, over 4719081.61 frames. ], batch size: 58, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:35:11,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:35:14,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:35:14,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:35:18,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:18,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 06:35:18,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:35:20,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:35:20,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:35:24,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:35:28,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:28,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:29,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:35:29,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:31,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:35:34,543 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.984e+02 2.156e+02 2.490e+02 5.109e+02, threshold=4.311e+02, percent-clipped=1.0 2023-09-30 06:35:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:34,877 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 06:35:42,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 06:35:42,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:35:45,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:35:45,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 06:35:45,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:35:50,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:52,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 06:35:54,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:54,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:35:57,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:57,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:35:59,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-30 06:35:59,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=626906.6666666666, ans=0.1 2023-09-30 06:36:04,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:06,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 06:36:06,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:06,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 06:36:10,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:36:12,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:36:14,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:15,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 06:36:17,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:36:17,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:36:18,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 06:36:18,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:21,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:23,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 06:36:25,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 06:36:25,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:36:25,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:27,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:36:28,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:36:30,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:36:31,895 INFO [train.py:1039] (0/4) Epoch 18, batch 3750, loss[loss=0.1946, simple_loss=0.2683, pruned_loss=0.06042, over 23533.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2573, pruned_loss=0.05402, over 4711768.92 frames. ], batch size: 106, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:36:32,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 06:36:33,547 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-09-30 06:36:34,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:36:35,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 06:36:37,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:36:40,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:36:41,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 06:36:42,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:36:43,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:45,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:46,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:36:50,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:51,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 06:36:53,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:36:57,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:37:00,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:00,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 06:37:01,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:03,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:04,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:37:08,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 06:37:11,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 06:37:13,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:15,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:16,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:37:18,747 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.10 vs. limit=15.0 2023-09-30 06:37:20,058 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=627240.0, ans=0.0 2023-09-30 06:37:21,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:24,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:37:28,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 06:37:29,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=627240.0, ans=0.1 2023-09-30 06:37:31,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:34,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:37:34,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:37:39,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:37:44,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 06:37:45,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:37:48,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:37:49,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:37:50,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=627306.6666666666, ans=0.125 2023-09-30 06:37:50,479 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.44 vs. limit=22.5 2023-09-30 06:37:51,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:37:54,267 INFO [train.py:1039] (0/4) Epoch 18, batch 3800, loss[loss=0.2044, simple_loss=0.2595, pruned_loss=0.07459, over 19642.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2568, pruned_loss=0.05367, over 4717362.09 frames. ], batch size: 388, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:37:59,252 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:38:03,825 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.89 vs. limit=15.0 2023-09-30 06:38:04,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:06,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-30 06:38:06,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 06:38:07,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:10,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:38:13,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 06:38:13,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:14,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:38:16,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:16,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:38:17,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:17,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-09-30 06:38:19,113 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.827e+02 2.039e+02 2.367e+02 3.749e+02, threshold=4.078e+02, percent-clipped=0.0 2023-09-30 06:38:20,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 06:38:22,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:38:25,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:27,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:38:27,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=627506.6666666666, ans=0.125 2023-09-30 06:38:28,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:38:30,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:38:30,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:32,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:33,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:38:40,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:38:40,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 06:38:42,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:38:44,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=627573.3333333334, ans=0.0 2023-09-30 06:38:48,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:38:55,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:38:57,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 06:39:00,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 06:39:01,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:03,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:39:05,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:05,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-09-30 06:39:08,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 06:39:08,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 06:39:08,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:11,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:39:16,692 INFO [train.py:1039] (0/4) Epoch 18, batch 3850, loss[loss=0.1765, simple_loss=0.2292, pruned_loss=0.06186, over 22714.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2558, pruned_loss=0.05336, over 4718631.37 frames. ], batch size: 322, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:39:18,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:39:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:39:23,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:39:23,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=627706.6666666666, ans=0.125 2023-09-30 06:39:24,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 06:39:24,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 06:39:26,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:29,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:39:31,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=627773.3333333334, ans=0.2 2023-09-30 06:39:32,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:35,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:39:37,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 06:39:42,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:45,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:47,219 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=627840.0, ans=0.125 2023-09-30 06:39:49,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:39:49,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:39:52,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:39:52,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:39:53,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:53,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:39:53,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:56,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:57,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:58,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:39:58,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 06:39:58,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 06:40:01,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:01,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:40:05,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 06:40:08,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 06:40:10,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:10,922 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.96 vs. limit=15.0 2023-09-30 06:40:11,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 06:40:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:40:19,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:21,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:25,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:25,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 06:40:28,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. 
Duration: 0.65 2023-09-30 06:40:30,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:31,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:34,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:40:34,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:40:35,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:40:37,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 06:40:37,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:38,624 INFO [train.py:1039] (0/4) Epoch 18, batch 3900, loss[loss=0.158, simple_loss=0.2049, pruned_loss=0.05556, over 19126.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2543, pruned_loss=0.05302, over 4708516.80 frames. ], batch size: 388, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:40:38,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. 
Duration: 0.89 2023-09-30 06:40:38,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:38,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:40,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:40:41,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:43,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:40:43,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:43,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:45,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:40:45,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 06:40:45,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:48,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:50,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-30 06:40:50,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:40:52,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:53,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:53,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:55,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:40:57,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 06:40:57,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:40:58,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 06:41:00,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:41:00,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 06:41:01,254 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.20 vs. limit=15.0 2023-09-30 06:41:02,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. 
Duration: 0.78 2023-09-30 06:41:03,821 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.888e+02 2.025e+02 2.251e+02 3.863e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 06:41:08,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:09,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:41:09,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:41:10,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:13,928 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=628173.3333333334, ans=0.1 2023-09-30 06:41:16,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:18,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:41:21,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:41:21,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:23,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 06:41:26,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=628240.0, ans=0.0 2023-09-30 06:41:28,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:41:28,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:41:30,831 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.17 vs. limit=10.0 2023-09-30 06:41:37,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:41:38,458 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0 2023-09-30 06:41:39,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:41:49,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:41:51,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:51,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 06:41:52,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. 
Duration: 0.6 2023-09-30 06:41:52,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:54,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 06:41:56,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:57,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 06:41:58,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=628306.6666666666, ans=0.125 2023-09-30 06:42:00,965 INFO [train.py:1039] (0/4) Epoch 18, batch 3950, loss[loss=0.1695, simple_loss=0.2557, pruned_loss=0.04167, over 24481.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2539, pruned_loss=0.0525, over 4710806.40 frames. ], batch size: 66, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:42:04,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:42:06,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 06:42:06,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:42:09,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:42:09,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:42:16,487 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 06:42:17,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:18,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 06:42:19,458 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 06:42:19,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:42:23,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:23,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:42:23,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:26,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 06:42:28,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:42:28,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:28,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 06:42:30,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:42:31,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:42:44,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:42:44,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:42:49,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 06:42:54,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 06:42:54,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 06:42:54,864 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=628573.3333333334, ans=0.0 2023-09-30 06:42:55,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:42:56,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:43:04,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:43:05,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 06:43:05,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:06,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:43:06,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 06:43:11,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:43:12,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:43:17,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 06:43:18,562 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.66 vs. limit=22.5 2023-09-30 06:43:24,662 INFO [train.py:1039] (0/4) Epoch 18, batch 4000, loss[loss=0.189, simple_loss=0.2609, pruned_loss=0.05854, over 23252.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2543, pruned_loss=0.05255, over 4716915.36 frames. ], batch size: 119, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:43:27,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:37,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:43:41,389 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=628773.3333333334, ans=0.2 2023-09-30 06:43:42,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:43,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:43:44,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:44,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 06:43:44,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:43:45,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 06:43:45,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:43:45,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 06:43:47,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:47,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.92 vs. 
limit=22.5 2023-09-30 06:43:48,584 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.924e+02 2.193e+02 2.601e+02 4.615e+02, threshold=4.387e+02, percent-clipped=1.0 2023-09-30 06:43:50,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:43:52,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:43:52,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:43:52,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:52,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:43:54,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:43:56,084 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 06:43:57,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:43:57,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:43:59,447 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 06:44:00,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 06:44:01,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:12,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 06:44:12,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:44:14,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:44:15,907 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 06:44:16,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:44:17,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=628906.6666666666, ans=0.0 2023-09-30 06:44:18,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 06:44:18,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:44:18,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:19,542 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.38 vs. limit=15.0 2023-09-30 06:44:20,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 06:44:21,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:44:21,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:44:21,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:23,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 06:44:25,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:26,615 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 06:44:33,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:44:36,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:44:37,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:44:37,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:39,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:44:39,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 06:44:43,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:45,897 INFO [train.py:1039] (0/4) Epoch 18, batch 4050, loss[loss=0.1764, simple_loss=0.2447, pruned_loss=0.05409, over 23689.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2553, pruned_loss=0.05348, over 4698520.41 frames. ], batch size: 149, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:44:48,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:44:48,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 06:44:51,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:44:51,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:44:52,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:44:54,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:44:55,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:00,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:02,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 06:45:03,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:45:07,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:45:07,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:45:11,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:13,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:45:16,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 06:45:20,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 06:45:20,158 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 06:45:20,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=629173.3333333334, ans=0.0 2023-09-30 06:45:22,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:45:31,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 06:45:31,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:45:31,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=629173.3333333334, ans=0.0 2023-09-30 06:45:34,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:37,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:37,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:45:37,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:41,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:44,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 06:45:44,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:45:46,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:45:47,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 06:45:52,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:46:01,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. 
Duration: 0.85 2023-09-30 06:46:01,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:01,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:46:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 06:46:04,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 06:46:04,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:07,713 INFO [train.py:1039] (0/4) Epoch 18, batch 4100, loss[loss=0.1916, simple_loss=0.2772, pruned_loss=0.05303, over 24563.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2558, pruned_loss=0.05345, over 4688977.01 frames. ], batch size: 71, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:46:07,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:11,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:11,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:46:18,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 06:46:19,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. 
Duration: 0.9 2023-09-30 06:46:21,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 06:46:22,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 06:46:22,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:24,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:46:24,386 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 06:46:27,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:27,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:46:27,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:29,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 06:46:33,104 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.931e+02 2.115e+02 2.277e+02 3.051e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-30 06:46:34,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:46:36,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:36,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:46:36,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 06:46:37,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:37,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:46:37,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:39,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:46:40,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 06:46:44,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:46:47,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. 
Duration: 0.56 2023-09-30 06:46:48,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:52,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:52,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 06:46:52,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:54,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:46:54,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:46:55,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 06:46:57,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:46:58,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:47:00,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 06:47:00,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:00,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 06:47:02,354 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=629573.3333333334, ans=0.1 2023-09-30 06:47:05,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:09,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:12,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:14,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:47:18,468 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=629640.0, ans=0.125 2023-09-30 06:47:23,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:23,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:25,066 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=629640.0, ans=0.125 2023-09-30 06:47:26,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:29,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 06:47:30,943 INFO [train.py:1039] (0/4) Epoch 18, batch 4150, loss[loss=0.1616, simple_loss=0.2347, pruned_loss=0.04428, over 24352.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.256, pruned_loss=0.05352, over 4697352.33 frames. ], batch size: 56, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:47:32,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:34,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:47:35,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:47:35,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:47:38,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 06:47:38,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:38,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 06:47:41,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 06:47:41,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 06:47:42,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 06:47:44,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=629706.6666666666, ans=0.2 2023-09-30 06:47:47,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:47:47,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:52,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:54,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:47:56,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:47:59,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:47:59,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:48:00,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:48:00,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=629773.3333333334, ans=0.1 2023-09-30 06:48:02,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:48:05,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:05,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 06:48:09,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 06:48:09,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:48:11,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 06:48:11,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:48:11,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:15,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:17,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:20,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 06:48:23,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:25,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 06:48:26,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 06:48:29,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:30,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 06:48:33,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:48:33,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:35,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:36,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 06:48:36,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:48:36,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:48:37,255 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.93 vs. limit=15.0 2023-09-30 06:48:38,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:48:41,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. 
Duration: 0.98 2023-09-30 06:48:41,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:41,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:48:41,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:48:42,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 06:48:43,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:43,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:48:44,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:46,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:46,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 06:48:46,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:48,574 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=629973.3333333334, ans=0.2 2023-09-30 06:48:53,164 INFO [train.py:1039] (0/4) Epoch 18, batch 4200, loss[loss=0.1659, simple_loss=0.2207, pruned_loss=0.05556, over 22635.00 frames. 
], tot_loss[loss=0.1805, simple_loss=0.2547, pruned_loss=0.05312, over 4706319.68 frames. ], batch size: 322, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:48:53,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:48:54,046 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.68 vs. limit=15.0 2023-09-30 06:48:56,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 06:48:56,542 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:48:56,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=630040.0, ans=0.07 2023-09-30 06:49:00,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:02,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:49:03,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:03,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:05,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. 
Duration: 0.92 2023-09-30 06:49:05,836 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=630040.0, ans=0.125 2023-09-30 06:49:07,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 06:49:07,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:08,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:11,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:49:13,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:49:14,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=630106.6666666666, ans=0.125 2023-09-30 06:49:16,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:17,442 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.914e+02 2.122e+02 2.477e+02 4.078e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 06:49:17,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:17,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 06:49:17,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 06:49:19,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:19,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:20,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:49:21,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:49:25,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 06:49:26,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:29,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:49:31,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:49:33,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:49:34,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:49:37,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:49:37,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. 
Duration: 0.58 2023-09-30 06:49:37,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:49:38,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:49:38,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=630173.3333333334, ans=0.1 2023-09-30 06:49:44,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:49:46,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:51,586 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.54 vs. limit=15.0 2023-09-30 06:49:52,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:49:55,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 06:49:58,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:49:59,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=630306.6666666666, ans=0.0 2023-09-30 06:50:04,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 06:50:04,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:06,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 06:50:11,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:50:14,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:50:15,764 INFO [train.py:1039] (0/4) Epoch 18, batch 4250, loss[loss=0.1763, simple_loss=0.2198, pruned_loss=0.06639, over 18946.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2532, pruned_loss=0.05282, over 4708914.64 frames. ], batch size: 388, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:50:15,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:50:16,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=630373.3333333334, ans=0.125 2023-09-30 06:50:18,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:25,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:50:25,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 06:50:26,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 06:50:28,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:32,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:36,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:36,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:37,742 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.17 vs. limit=22.5 2023-09-30 06:50:39,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:50:39,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:50:40,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:42,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:42,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:44,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 06:50:45,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:47,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 06:50:49,540 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.38 vs. limit=15.0 2023-09-30 06:50:50,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 06:50:51,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:53,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:53,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:53,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:50:53,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:54,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:58,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:50:59,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 06:51:00,844 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.20 vs. limit=15.0 2023-09-30 06:51:03,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:03,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=630573.3333333334, ans=0.035 2023-09-30 06:51:04,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:06,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 06:51:06,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:51:08,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 06:51:10,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:51:12,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:51:15,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:15,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:51:18,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. 
Duration: 0.96 2023-09-30 06:51:20,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:51:21,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:51:26,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:26,554 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=630640.0, ans=0.125 2023-09-30 06:51:29,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:30,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:51:32,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:32,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:34,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:51:36,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:51:36,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 06:51:37,586 INFO [train.py:1039] (0/4) Epoch 18, batch 4300, loss[loss=0.1855, simple_loss=0.2571, pruned_loss=0.05691, over 22793.00 frames. 
], tot_loss[loss=0.1785, simple_loss=0.2528, pruned_loss=0.05208, over 4715483.84 frames. ], batch size: 322, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:51:37,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:42,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:42,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:51:47,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:55,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=630773.3333333334, ans=0.1 2023-09-30 06:51:56,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:56,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 06:51:56,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:51:59,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:51:59,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:51:59,302 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-09-30 06:52:02,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:52:03,681 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.954e+02 2.321e+02 2.799e+02 4.498e+02, threshold=4.642e+02, percent-clipped=1.0 2023-09-30 06:52:05,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:07,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 06:52:07,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:52:09,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 06:52:10,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:52:12,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:52:13,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:52:13,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:52:15,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:52:17,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 06:52:19,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:52:19,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 06:52:20,027 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=630840.0, ans=0.0 2023-09-30 06:52:21,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 06:52:23,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:52:26,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:26,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:52:26,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:26,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=630906.6666666666, ans=0.125 2023-09-30 06:52:28,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:28,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 06:52:28,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. 
Duration: 0.95 2023-09-30 06:52:29,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 06:52:29,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:52:29,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 06:52:31,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 06:52:34,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:35,912 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 06:52:38,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:52:39,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=630906.6666666666, ans=0.2 2023-09-30 06:52:39,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=630906.6666666666, ans=0.125 2023-09-30 06:52:40,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:40,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:44,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. 
Duration: 0.91 2023-09-30 06:52:44,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:44,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:45,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:52:45,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:52:45,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:52:47,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:52:50,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=630973.3333333334, ans=0.125 2023-09-30 06:52:52,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:52,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:53,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:53:00,692 INFO [train.py:1039] (0/4) Epoch 18, batch 4350, loss[loss=0.1829, simple_loss=0.2617, pruned_loss=0.05212, over 23917.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2536, pruned_loss=0.05222, over 4723074.93 frames. 
], batch size: 86, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:53:00,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 06:53:02,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:53:02,527 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=631040.0, ans=0.1 2023-09-30 06:53:05,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:10,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:12,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:53:12,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:53:16,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:53:20,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:23,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:53:23,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 06:53:26,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:53:28,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:53:30,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:53:30,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=631106.6666666666, ans=0.1 2023-09-30 06:53:36,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 06:53:37,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:39,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:44,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:44,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=631173.3333333334, ans=0.0 2023-09-30 06:53:44,644 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.35 vs. limit=15.0 2023-09-30 06:53:47,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. 
Duration: 0.99 2023-09-30 06:53:49,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:53:50,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:53:56,821 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 06:53:58,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:53:58,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:53:59,917 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 06:54:01,822 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 06:54:01,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:03,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:04,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:54:04,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:06,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 06:54:09,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:10,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 06:54:10,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:10,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:12,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:12,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 06:54:13,634 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 06:54:13,641 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 06:54:13,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 06:54:18,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:54:18,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:54:18,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:54:19,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:54:21,160 INFO [train.py:1039] (0/4) Epoch 18, batch 4400, loss[loss=0.1672, simple_loss=0.2456, pruned_loss=0.04441, over 24644.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2541, pruned_loss=0.05224, over 4729358.96 frames. ], batch size: 65, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:54:21,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 06:54:22,848 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 06:54:22,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:23,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=631373.3333333334, ans=0.0 2023-09-30 06:54:26,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:26,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:29,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:29,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=631373.3333333334, ans=0.04949747468305833 2023-09-30 06:54:32,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-09-30 06:54:32,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 06:54:32,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 06:54:33,981 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 06:54:34,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:54:34,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:37,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 06:54:39,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:40,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:40,886 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 06:54:46,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:46,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 06:54:47,800 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. 
Duration: 0.896 2023-09-30 06:54:49,101 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.990e+02 2.241e+02 2.697e+02 4.171e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-30 06:54:50,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 06:54:50,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 06:54:52,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 06:54:52,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:52,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:54,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:55,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:56,228 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.83 vs. limit=22.5 2023-09-30 06:54:57,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 06:54:57,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 06:54:58,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:55:00,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:55:00,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:02,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:03,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:03,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 06:55:05,278 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 06:55:07,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:15,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:55:16,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 06:55:19,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:55:21,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 06:55:21,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=631573.3333333334, ans=0.1 2023-09-30 06:55:25,418 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.78 vs. limit=15.0 2023-09-30 06:55:25,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:55:25,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 06:55:25,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:55:27,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:55:27,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:55:28,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:55:33,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 06:55:36,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 06:55:38,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 06:55:38,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:55:38,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 06:55:40,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:55:41,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:55:43,113 INFO [train.py:1039] (0/4) Epoch 18, batch 4450, loss[loss=0.2354, simple_loss=0.2922, pruned_loss=0.08936, over 19729.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2548, pruned_loss=0.05208, over 4709608.79 frames. ], batch size: 388, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:55:43,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 06:55:48,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:50,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:50,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:55:57,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:55:57,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 06:56:00,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=631773.3333333334, ans=0.09899494936611666 2023-09-30 06:56:01,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:03,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:56:04,502 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.81 vs. limit=15.0 2023-09-30 06:56:05,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:56:05,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:07,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 06:56:07,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:08,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:10,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:10,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. 
Duration: 0.611125 2023-09-30 06:56:13,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:56:17,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:19,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:21,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:21,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:23,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=631840.0, ans=0.0 2023-09-30 06:56:23,062 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=631840.0, ans=0.125 2023-09-30 06:56:24,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:56:27,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:56:29,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 06:56:31,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 06:56:31,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 06:56:34,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:34,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 06:56:36,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=631906.6666666666, ans=0.125 2023-09-30 06:56:39,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:56:42,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:44,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 06:56:44,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:44,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:56:44,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:56:44,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:47,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:50,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. 
Duration: 0.62225 2023-09-30 06:56:51,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 06:56:53,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:56:57,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:57,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:57:00,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:00,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:57:03,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:57:05,027 INFO [train.py:1039] (0/4) Epoch 18, batch 4500, loss[loss=0.1654, simple_loss=0.2282, pruned_loss=0.05126, over 23383.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.256, pruned_loss=0.05269, over 4716792.44 frames. ], batch size: 285, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:57:05,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 06:57:08,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:57:14,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 06:57:14,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=632040.0, ans=10.0 2023-09-30 06:57:14,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=632040.0, ans=0.0 2023-09-30 06:57:15,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 06:57:15,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 06:57:17,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:20,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:22,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:22,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:57:23,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:57:24,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:25,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 06:57:28,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=632106.6666666666, ans=0.125 2023-09-30 06:57:33,296 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.852e+02 2.175e+02 2.491e+02 3.622e+02, threshold=4.350e+02, percent-clipped=0.0 2023-09-30 06:57:38,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:38,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:57:42,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:57:42,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:57:44,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:57:51,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:57:55,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:57:58,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:57:59,758 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=8.55 vs. 
limit=22.5 2023-09-30 06:58:01,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:58:01,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 06:58:02,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:02,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:03,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:05,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:58:05,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=632240.0, ans=0.125 2023-09-30 06:58:08,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:58:08,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 06:58:08,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:58:08,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:15,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 06:58:16,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:58:17,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:21,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:58:21,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:58:24,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 06:58:25,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 06:58:25,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 06:58:28,736 INFO [train.py:1039] (0/4) Epoch 18, batch 4550, loss[loss=0.16, simple_loss=0.2374, pruned_loss=0.04131, over 24462.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.255, pruned_loss=0.05276, over 4715929.34 frames. ], batch size: 63, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:58:28,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 06:58:32,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 06:58:32,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 06:58:36,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:36,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:40,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:40,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=632373.3333333334, ans=0.125 2023-09-30 06:58:44,253 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=632440.0, ans=0.125 2023-09-30 06:58:45,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:58:47,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:48,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:58:50,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:58:50,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:52,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 06:58:52,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=632440.0, ans=0.125 2023-09-30 06:58:54,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:56,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:58:57,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 06:58:59,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 06:58:59,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:59:00,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 06:59:01,846 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.24 vs. limit=15.0 2023-09-30 06:59:03,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 06:59:04,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:05,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 06:59:07,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 06:59:11,634 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.42 vs. limit=6.0 2023-09-30 06:59:12,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:59:15,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 06:59:15,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=632506.6666666666, ans=0.1 2023-09-30 06:59:18,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:19,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=632573.3333333334, ans=0.0 2023-09-30 06:59:21,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:21,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:24,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 06:59:26,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 06:59:28,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 06:59:28,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:59:28,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 06:59:33,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 06:59:33,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:33,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:34,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:59:34,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:36,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:59:37,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:59:38,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. 
Duration: 0.97 2023-09-30 06:59:39,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:39,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 06:59:41,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 06:59:41,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:59:41,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 06:59:46,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:59:46,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:59:47,064 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.45 vs. limit=15.0 2023-09-30 06:59:47,111 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.26 vs. limit=22.5 2023-09-30 06:59:48,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:59:48,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 06:59:50,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:59:50,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:59:51,590 INFO [train.py:1039] (0/4) Epoch 18, batch 4600, loss[loss=0.181, simple_loss=0.2695, pruned_loss=0.04626, over 24640.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2541, pruned_loss=0.05221, over 4716717.50 frames. ], batch size: 68, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:59:53,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:59:54,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:56,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:00:01,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:00:01,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:00:02,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:03,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 07:00:05,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 07:00:08,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:00:10,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:11,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:12,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=632773.3333333334, ans=0.1 2023-09-30 07:00:18,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 07:00:18,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:20,149 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.823e+02 2.071e+02 2.370e+02 3.584e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 07:00:21,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:26,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:00:26,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:31,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 07:00:31,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 07:00:33,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:00:40,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:40,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:00:42,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:00:46,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 07:00:48,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:00:53,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:54,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:00:55,237 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=632906.6666666666, ans=0.125 2023-09-30 07:00:58,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:58,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. 
Duration: 0.4555625 2023-09-30 07:00:59,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:59,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 07:00:59,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:59,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:01,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:02,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:02,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:03,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 07:01:03,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 07:01:03,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 07:01:03,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:03,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:01:05,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:05,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:15,439 INFO [train.py:1039] (0/4) Epoch 18, batch 4650, loss[loss=0.1826, simple_loss=0.2557, pruned_loss=0.05473, over 23388.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2532, pruned_loss=0.05248, over 4702102.56 frames. ], batch size: 105, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:01:17,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:01:20,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:20,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:20,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:01:20,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:20,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:22,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:25,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. 
Duration: 0.58 2023-09-30 07:01:30,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:01:32,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 07:01:33,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:35,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 07:01:35,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:01:35,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 07:01:37,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 07:01:37,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:37,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:01:40,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:01:43,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:43,115 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. 
Duration: 0.768 2023-09-30 07:01:43,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=633106.6666666666, ans=0.0 2023-09-30 07:01:43,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=633106.6666666666, ans=0.0 2023-09-30 07:01:45,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:46,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=633106.6666666666, ans=0.0 2023-09-30 07:01:47,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 07:01:49,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=633173.3333333334, ans=0.0 2023-09-30 07:01:50,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:50,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:01:52,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 07:01:52,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:52,590 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=633173.3333333334, ans=0.2 2023-09-30 07:01:55,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 07:01:59,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:03,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:06,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:02:06,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=633240.0, ans=0.125 2023-09-30 07:02:09,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 07:02:09,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 07:02:11,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 07:02:11,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 07:02:14,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:21,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 07:02:21,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:21,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 07:02:21,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:22,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:22,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:02:24,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:02:25,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:02:25,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:25,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:30,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:30,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:02:30,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 07:02:32,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:02:32,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:02:34,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 07:02:39,274 INFO [train.py:1039] (0/4) Epoch 18, batch 4700, loss[loss=0.1768, simple_loss=0.2556, pruned_loss=0.04896, over 24587.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2546, pruned_loss=0.05294, over 4698007.08 frames. ], batch size: 60, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:02:43,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:44,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=633373.3333333334, ans=0.0 2023-09-30 07:02:45,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:47,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:02:49,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:49,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:02:54,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. 
Duration: 0.56 2023-09-30 07:02:54,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 07:02:58,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:58,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:02:59,055 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=633440.0, ans=0.1 2023-09-30 07:03:00,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:03:05,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:06,746 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.845e+02 2.033e+02 2.292e+02 3.478e+02, threshold=4.067e+02, percent-clipped=0.0 2023-09-30 07:03:13,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:03:15,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 07:03:18,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:03:24,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. 
Duration: 0.68 2023-09-30 07:03:24,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:03:26,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:31,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 07:03:33,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:03:39,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:03:39,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 07:03:41,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:41,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:43,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:44,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:03:44,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. 
Duration: 0.67 2023-09-30 07:03:45,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=633640.0, ans=0.1 2023-09-30 07:03:46,424 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 07:03:48,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:48,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 07:03:50,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:54,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 07:03:55,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=633640.0, ans=0.1 2023-09-30 07:03:57,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:03:59,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:01,455 INFO [train.py:1039] (0/4) Epoch 18, batch 4750, loss[loss=0.1838, simple_loss=0.2556, pruned_loss=0.05599, over 23620.00 frames. 
], tot_loss[loss=0.1807, simple_loss=0.2555, pruned_loss=0.05302, over 4704750.52 frames. ], batch size: 134, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:04:03,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:03,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:04:05,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 07:04:05,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:07,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=633706.6666666666, ans=0.125 2023-09-30 07:04:09,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 07:04:10,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:04:11,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:04:12,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:16,840 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=633773.3333333334, ans=0.125 2023-09-30 07:04:20,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. 
Duration: 0.93 2023-09-30 07:04:24,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:04:27,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 07:04:27,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:30,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:33,042 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 07:04:33,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 07:04:34,899 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:04:39,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 07:04:41,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:44,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:04:46,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:04:46,037 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 07:04:46,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:04:49,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:04:50,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:04:54,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 07:04:54,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 07:04:54,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:55,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:04:55,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:58,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:04:58,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. 
Duration: 0.71 2023-09-30 07:04:59,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 07:05:04,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:09,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:05:09,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 07:05:10,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:12,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:14,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:05:14,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:15,684 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.99 vs. limit=15.0 2023-09-30 07:05:16,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:05:20,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:20,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. 
Duration: 0.88 2023-09-30 07:05:22,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 07:05:23,731 INFO [train.py:1039] (0/4) Epoch 18, batch 4800, loss[loss=0.1578, simple_loss=0.235, pruned_loss=0.04027, over 24287.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2556, pruned_loss=0.05315, over 4710301.12 frames. ], batch size: 61, lr: 5.67e-03, grad_scale: 32.0 2023-09-30 07:05:23,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 07:05:25,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:05:25,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:27,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 07:05:33,738 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:33,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:37,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:05:39,941 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.90 vs. limit=10.0 2023-09-30 07:05:40,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:05:40,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:42,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 07:05:42,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:42,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:05:45,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:05:50,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:05:51,855 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.888e+02 2.165e+02 2.522e+02 3.456e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 07:05:54,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:54,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:05:55,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:55,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-30 07:05:55,848 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:55,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:59,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:02,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:06:05,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:06:07,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:10,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 07:06:10,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 07:06:12,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:12,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 07:06:12,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:06:12,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:12,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:06:15,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:06:15,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:20,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:22,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:24,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:29,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 07:06:29,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:29,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:30,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 07:06:30,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:35,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:36,237 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=15.0 2023-09-30 07:06:36,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:06:36,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:36,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:06:38,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:06:38,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:06:42,301 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=634306.6666666666, ans=0.0 2023-09-30 07:06:43,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:43,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:43,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:06:46,855 INFO [train.py:1039] (0/4) Epoch 18, batch 4850, loss[loss=0.1825, simple_loss=0.2486, pruned_loss=0.05818, over 23649.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2569, pruned_loss=0.05356, over 4713037.71 frames. ], batch size: 149, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:06:47,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 07:06:47,688 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.66 vs. limit=15.0 2023-09-30 07:06:48,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 07:06:48,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:48,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:50,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:06:50,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:51,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:07:01,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 07:07:01,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 07:07:02,112 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=634440.0, ans=0.0 2023-09-30 07:07:07,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:08,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:07:08,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:11,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:13,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:07:14,933 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:07:16,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:07:16,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 07:07:18,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:07:21,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:07:21,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 07:07:23,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:07:23,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 07:07:25,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:25,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 07:07:31,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 07:07:33,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:07:40,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:07:41,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 07:07:41,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:07:43,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 07:07:43,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:07:44,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 07:07:44,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:47,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 07:07:47,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:49,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:07:51,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 07:07:58,740 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:08:01,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:08,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:08:08,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:09,739 INFO [train.py:1039] (0/4) Epoch 18, batch 4900, loss[loss=0.1887, simple_loss=0.2698, pruned_loss=0.05376, over 23931.00 frames. 
], tot_loss[loss=0.1804, simple_loss=0.2557, pruned_loss=0.05257, over 4710669.35 frames. ], batch size: 80, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:08:13,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 07:08:13,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:08:18,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:19,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:21,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:08:23,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 07:08:23,696 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=634706.6666666666, ans=0.125 2023-09-30 07:08:28,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 07:08:34,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 07:08:34,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 07:08:35,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. 
Duration: 0.72225 2023-09-30 07:08:35,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:35,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:08:36,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:36,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:08:37,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 07:08:39,898 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.805e+02 1.986e+02 2.156e+02 3.448e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 07:08:40,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 07:08:41,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:08:43,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:08:43,252 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=634840.0, ans=0.0 2023-09-30 07:08:44,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:47,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 07:08:49,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:51,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:51,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 07:08:52,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:08:55,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:55,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 07:08:55,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 07:08:58,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 07:08:58,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=634906.6666666666, ans=0.125 2023-09-30 07:09:00,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:09:01,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:01,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 07:09:03,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:03,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:09:03,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:09:04,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 07:09:07,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:09,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:09:11,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:09:11,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=634906.6666666666, ans=0.125 2023-09-30 07:09:14,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 07:09:16,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:09:17,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:09:17,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. 
Duration: 0.79 2023-09-30 07:09:24,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:25,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:09:27,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 07:09:27,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:27,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:09:28,294 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.49 vs. limit=6.0 2023-09-30 07:09:29,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:30,234 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=634973.3333333334, ans=15.0 2023-09-30 07:09:32,647 INFO [train.py:1039] (0/4) Epoch 18, batch 4950, loss[loss=0.1675, simple_loss=0.2553, pruned_loss=0.03991, over 24452.00 frames. ], tot_loss[loss=0.179, simple_loss=0.254, pruned_loss=0.05205, over 4702155.08 frames. ], batch size: 69, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:09:32,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:32,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 07:09:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:32,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 07:09:34,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:09:37,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:37,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:38,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=635040.0, ans=0.125 2023-09-30 07:09:41,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 07:09:41,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 07:09:41,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:09:42,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 07:09:42,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:42,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 07:09:44,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:09:44,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:09:46,145 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=635040.0, ans=0.125 2023-09-30 07:09:48,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:48,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:09:49,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:09:51,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:52,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:52,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:56,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:10:03,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:10:03,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=635106.6666666666, ans=0.2 2023-09-30 07:10:05,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:10:07,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:08,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:10,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:10:11,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 07:10:13,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 07:10:14,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:17,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:10:17,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:10:19,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:10:19,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:10:20,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:10:21,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:22,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:10:25,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:10:26,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:26,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=635240.0, ans=0.1 2023-09-30 07:10:27,442 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.79 vs. limit=15.0 2023-09-30 07:10:28,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:28,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 07:10:28,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:10:29,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 07:10:35,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:10:36,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:10:36,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:10:38,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:38,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:10:38,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:10:40,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:10:41,147 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.31 vs. limit=15.0 2023-09-30 07:10:41,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:10:42,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:43,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. 
Duration: 0.97 2023-09-30 07:10:46,176 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.70 vs. limit=15.0 2023-09-30 07:10:46,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:10:48,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=635306.6666666666, ans=0.2 2023-09-30 07:10:51,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 07:10:51,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:10:55,498 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:10:56,580 INFO [train.py:1039] (0/4) Epoch 18, batch 5000, loss[loss=0.1735, simple_loss=0.2573, pruned_loss=0.04491, over 24580.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2542, pruned_loss=0.05179, over 4710742.82 frames. ], batch size: 71, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:10:58,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:58,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 07:11:00,087 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=635373.3333333334, ans=0.125 2023-09-30 07:11:00,106 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:11:01,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 07:11:01,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 07:11:03,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:04,519 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.76 vs. limit=12.0 2023-09-30 07:11:05,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 07:11:07,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:11:07,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:11:08,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 07:11:08,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:10,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 07:11:11,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 07:11:11,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:11,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:13,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 07:11:15,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 07:11:16,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:11:16,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 07:11:16,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:11:18,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:18,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:11:18,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 07:11:18,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. 
Duration: 0.6 2023-09-30 07:11:18,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=635440.0, ans=0.0 2023-09-30 07:11:19,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 07:11:21,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:22,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:22,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 07:11:22,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:24,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:26,589 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.857e+02 2.111e+02 2.507e+02 3.855e+02, threshold=4.222e+02, percent-clipped=0.0 2023-09-30 07:11:26,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:28,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:11:29,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 07:11:31,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 07:11:32,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:11:36,006 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 07:11:40,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:41,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:41,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:11:43,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 07:11:44,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:45,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:45,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:11:47,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 07:11:47,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:50,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 07:11:51,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:55,108 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=635573.3333333334, ans=0.04949747468305833 2023-09-30 07:11:56,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 07:12:01,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:12,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:12:13,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:15,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:12:15,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:15,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:12:15,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:12:16,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 07:12:20,226 INFO [train.py:1039] (0/4) Epoch 18, batch 5050, loss[loss=0.1836, simple_loss=0.2613, pruned_loss=0.05296, over 23758.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2549, pruned_loss=0.0522, over 4702914.67 frames. ], batch size: 149, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:12:21,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:23,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 07:12:24,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:12:26,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:27,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:12:28,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 07:12:29,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:29,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:12:32,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:12:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 07:12:35,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:12:37,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=635773.3333333334, ans=0.0 2023-09-30 07:12:42,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=635773.3333333334, ans=0.125 2023-09-30 07:12:45,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 07:12:45,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:12:47,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:12:47,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 07:12:47,432 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=635773.3333333334, ans=0.0 2023-09-30 07:12:48,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:12:49,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:52,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 07:12:52,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 07:12:52,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 07:12:54,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:58,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:12:59,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:59,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 07:13:01,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:01,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=635840.0, ans=0.1 2023-09-30 07:13:02,135 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-09-30 07:13:04,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 07:13:05,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:13:05,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 07:13:07,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:08,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:13:12,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:15,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:13:16,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:16,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:13:16,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:13:16,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 07:13:17,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:13:18,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:13:23,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:23,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. 
Duration: 0.896 2023-09-30 07:13:23,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:13:26,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:28,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:28,204 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 07:13:30,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:30,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 07:13:30,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:35,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 07:13:37,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 07:13:40,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 07:13:40,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:13:40,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:13:41,693 INFO [train.py:1039] (0/4) Epoch 18, batch 5100, loss[loss=0.1834, simple_loss=0.2566, pruned_loss=0.05513, over 23376.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2556, pruned_loss=0.05243, over 4714176.42 frames. ], batch size: 93, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:13:43,453 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 07:13:44,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:50,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 07:13:50,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 07:13:51,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:53,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:54,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:56,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. 
Duration: 0.96 2023-09-30 07:13:56,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 07:13:57,088 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.04 vs. limit=6.0 2023-09-30 07:14:01,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:14:01,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:14:07,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:14:08,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 07:14:10,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:10,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:14:10,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 07:14:11,821 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.826e+02 2.022e+02 2.261e+02 3.082e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 07:14:14,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:14:16,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 07:14:16,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=636173.3333333334, ans=0.125 2023-09-30 07:14:17,835 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 07:14:19,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:19,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 07:14:19,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 07:14:24,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:32,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:14:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 07:14:36,124 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 07:14:36,136 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. 
Duration: 0.896 2023-09-30 07:14:36,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=636240.0, ans=0.125 2023-09-30 07:14:37,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 07:14:37,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:40,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 07:14:43,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=636240.0, ans=0.2 2023-09-30 07:14:44,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 07:14:47,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:14:49,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:14:52,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 07:14:53,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:14:53,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 07:15:00,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 07:15:00,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:15:00,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:15:01,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:15:02,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:15:02,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:15:03,679 INFO [train.py:1039] (0/4) Epoch 18, batch 5150, loss[loss=0.1601, simple_loss=0.2333, pruned_loss=0.04351, over 15158.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2561, pruned_loss=0.05239, over 4708716.54 frames. ], batch size: 33, lr: 5.66e-03, grad_scale: 8.0 2023-09-30 07:15:03,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 07:15:03,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 07:15:05,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 07:15:05,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:15:06,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. 
Duration: 0.97 2023-09-30 07:15:10,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:10,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:15:12,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:14,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:18,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:15:18,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 07:15:21,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:21,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:15:22,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:15:22,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:22,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:22,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 07:15:22,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:15:23,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=636440.0, ans=0.025 2023-09-30 07:15:24,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 07:15:25,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:15:27,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:15:30,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:15:32,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 07:15:34,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:15:37,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=636506.6666666666, ans=0.125 2023-09-30 07:15:39,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:15:40,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. 
Duration: 0.91 2023-09-30 07:15:40,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=636506.6666666666, ans=0.125 2023-09-30 07:15:40,888 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=636506.6666666666, ans=0.0 2023-09-30 07:15:47,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:49,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=636506.6666666666, ans=0.125 2023-09-30 07:15:51,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=636506.6666666666, ans=0.125 2023-09-30 07:15:54,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:56,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:58,143 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=636573.3333333334, ans=0.125 2023-09-30 07:16:00,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:00,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:03,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-09-30 07:16:06,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:16:08,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:16:08,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:16:12,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:13,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:15,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 07:16:18,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:16:20,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:16:23,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:16:23,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:16:24,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:16:24,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 07:16:24,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:16:24,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:16:27,165 INFO [train.py:1039] (0/4) Epoch 18, batch 5200, loss[loss=0.1882, simple_loss=0.2676, pruned_loss=0.05442, over 24036.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2558, pruned_loss=0.05251, over 4714469.98 frames. ], batch size: 80, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:16:27,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=636706.6666666666, ans=0.125 2023-09-30 07:16:28,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:16:31,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:16:31,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=636706.6666666666, ans=0.2 2023-09-30 07:16:35,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:39,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 07:16:41,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:16:42,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 07:16:44,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:46,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:16:46,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:48,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 07:16:51,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:16:52,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:53,644 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=636773.3333333334, ans=0.125 2023-09-30 07:16:55,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 07:16:58,252 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.839e+02 2.029e+02 2.263e+02 2.861e+02, threshold=4.059e+02, percent-clipped=0.0 2023-09-30 07:16:58,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:16:59,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 07:17:00,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 07:17:01,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 07:17:03,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 07:17:03,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:03,187 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 07:17:03,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:17:04,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:06,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:17:06,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 07:17:08,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:09,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:14,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. 
Duration: 0.8 2023-09-30 07:17:14,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 07:17:14,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 07:17:19,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 07:17:19,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:17:25,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:17:27,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:28,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 07:17:28,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:28,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=636906.6666666666, ans=0.1 2023-09-30 07:17:30,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:17:30,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:30,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-30 07:17:30,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=636906.6666666666, ans=0.025 2023-09-30 07:17:34,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:36,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:17:39,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:39,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:39,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:40,786 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.59 vs. limit=10.0 2023-09-30 07:17:44,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:46,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 07:17:46,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=636973.3333333334, ans=0.125 2023-09-30 07:17:47,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 07:17:47,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:17:49,456 INFO [train.py:1039] (0/4) Epoch 18, batch 5250, loss[loss=0.1831, simple_loss=0.2666, pruned_loss=0.04986, over 24443.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2553, pruned_loss=0.05218, over 4706500.05 frames. ], batch size: 66, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:17:49,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:49,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:17:51,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:17:53,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:57,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:59,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:18:00,497 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.53 vs. limit=15.0 2023-09-30 07:18:01,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:18:06,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:18:08,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:18:09,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:18:11,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:18:13,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 07:18:13,038 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:18:14,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:18:23,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=637173.3333333334, ans=0.2 2023-09-30 07:18:30,848 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.08 vs. 
limit=15.0 2023-09-30 07:18:40,321 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=637240.0, ans=0.125 2023-09-30 07:18:51,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=637306.6666666666, ans=0.0 2023-09-30 07:18:59,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=637306.6666666666, ans=0.125 2023-09-30 07:19:04,839 INFO [train.py:1039] (0/4) Epoch 18, batch 5300, loss[loss=0.1693, simple_loss=0.2534, pruned_loss=0.0426, over 24465.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2532, pruned_loss=0.05173, over 4699343.22 frames. ], batch size: 69, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:19:16,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=637373.3333333334, ans=0.1 2023-09-30 07:19:20,777 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-18.pt 2023-09-30 07:19:26,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:19:26,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 07:19:26,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 07:19:26,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:27,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 07:19:27,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:27,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:27,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:27,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:19:27,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:27,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:19:28,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:19:28,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 07:19:28,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 07:19:28,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 07:19:28,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:19:28,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. 
Duration: 0.83 2023-09-30 07:19:28,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 07:19:29,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:29,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:29,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:30,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:30,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:19:30,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:30,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:30,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:30,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:30,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:30,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 07:19:30,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:30,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:19:31,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 07:19:31,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:32,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:32,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 07:19:32,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 07:19:32,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:19:32,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:19:32,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 07:19:32,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 07:19:33,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. 
Duration: 0.563625 2023-09-30 07:19:34,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:19:34,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:34,610 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 07:19:34,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 07:19:34,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:19:34,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:35,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 07:19:35,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 07:19:35,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 07:19:35,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:38,623 INFO [train.py:1039] (0/4) Epoch 19, batch 0, loss[loss=0.1702, simple_loss=0.2483, pruned_loss=0.04604, over 23743.00 frames. ], tot_loss[loss=0.1702, simple_loss=0.2483, pruned_loss=0.04604, over 23743.00 frames. 
], batch size: 212, lr: 5.50e-03, grad_scale: 32.0 2023-09-30 07:19:38,624 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 07:19:52,801 INFO [train.py:1071] (0/4) Epoch 19, validation: loss=0.3241, simple_loss=0.2677, pruned_loss=0.1902, over 1125622.00 frames. 2023-09-30 07:19:52,801 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 07:19:54,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=637460.0, ans=0.125 2023-09-30 07:19:55,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 07:19:55,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:19:58,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:20:01,906 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.881e+02 2.156e+02 2.381e+02 5.566e+02, threshold=4.312e+02, percent-clipped=3.0 2023-09-30 07:20:06,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:06,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:20:06,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:07,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. 
Duration: 0.68 2023-09-30 07:20:08,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=637526.6666666666, ans=0.125 2023-09-30 07:20:09,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 07:20:12,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:12,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:17,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:19,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:19,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:20:19,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:20,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 07:20:20,974 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=637526.6666666666, ans=0.125 2023-09-30 07:20:22,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:31,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 07:20:31,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:34,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 07:20:39,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:20:39,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:20:41,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:45,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:20:47,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=637660.0, ans=0.125 2023-09-30 07:20:48,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:49,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=637660.0, ans=0.1 2023-09-30 07:20:53,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=637660.0, ans=0.1 2023-09-30 07:20:55,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 07:20:59,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. 
Duration: 0.86 2023-09-30 07:20:59,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:20:59,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:01,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:21:01,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:04,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 07:21:05,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:08,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:09,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=637726.6666666666, ans=0.0 2023-09-30 07:21:12,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:21:13,722 INFO [train.py:1039] (0/4) Epoch 19, batch 50, loss[loss=0.1965, simple_loss=0.2622, pruned_loss=0.06542, over 23727.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2554, pruned_loss=0.05208, over 1070340.55 frames. ], batch size: 164, lr: 5.50e-03, grad_scale: 16.0 2023-09-30 07:21:15,521 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. 
Duration: 0.768 2023-09-30 07:21:17,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:21:21,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:23,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:21:23,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 07:21:23,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:21:23,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=637793.3333333334, ans=0.125 2023-09-30 07:21:24,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:21:26,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:26,712 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=637793.3333333334, ans=0.125 2023-09-30 07:21:28,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:30,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:21:35,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 07:21:35,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:40,860 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=637860.0, ans=0.0 2023-09-30 07:21:42,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:21:43,034 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=637860.0, ans=0.0 2023-09-30 07:21:45,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 07:21:47,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 07:21:48,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:21:48,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:21:48,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:50,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:51,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 07:21:51,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:21:51,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:22:01,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:01,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:01,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:22:02,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=637993.3333333334, ans=0.0 2023-09-30 07:22:03,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 07:22:04,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:22:06,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:22:06,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 07:22:06,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=637993.3333333334, ans=0.1 2023-09-30 07:22:07,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:22:10,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 07:22:16,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:22:16,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:17,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:19,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:19,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:22,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 07:22:22,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 07:22:23,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:25,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:26,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:22:28,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:22:29,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 07:22:29,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 07:22:30,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:22:32,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:33,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:22:34,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 07:22:34,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 07:22:36,021 INFO [train.py:1039] (0/4) Epoch 19, batch 100, loss[loss=0.1765, simple_loss=0.2682, pruned_loss=0.04243, over 24691.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2584, pruned_loss=0.05233, over 1886508.11 frames. ], batch size: 73, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:22:36,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:37,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:39,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. 
Duration: 0.62225 2023-09-30 07:22:39,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:22:39,485 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=638126.6666666666, ans=0.2 2023-09-30 07:22:42,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:22:45,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:22:47,049 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.406e+02 1.850e+02 1.971e+02 2.245e+02 4.662e+02, threshold=3.942e+02, percent-clipped=2.0 2023-09-30 07:22:50,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:22:50,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 07:22:50,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:55,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=638193.3333333334, ans=0.125 2023-09-30 07:22:57,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:22:57,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 07:22:57,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=638193.3333333334, ans=0.0 2023-09-30 07:22:58,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:58,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:58,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:59,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 07:23:02,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:23:02,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:02,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:02,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:23:06,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 07:23:06,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:08,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 07:23:09,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:23:11,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:23:12,261 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=638260.0, ans=0.125 2023-09-30 07:23:15,051 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 07:23:15,078 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 07:23:16,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:16,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:23:21,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:23:23,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:26,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:30,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 07:23:30,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638326.6666666666, ans=0.1 2023-09-30 07:23:31,716 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 07:23:33,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:23:34,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=638326.6666666666, ans=0.125 2023-09-30 07:23:35,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:23:37,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:23:40,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:44,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:47,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:23:47,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:23:50,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:51,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 07:23:53,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:53,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:23:55,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:55,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 07:23:55,287 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 07:23:55,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:56,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:23:56,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:56,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:58,190 INFO [train.py:1039] (0/4) Epoch 19, batch 150, loss[loss=0.1952, simple_loss=0.2724, pruned_loss=0.05901, over 24408.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2584, pruned_loss=0.0528, over 2513841.94 frames. ], batch size: 77, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:23:58,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-30 07:23:58,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:23:58,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:23:58,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:59,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:01,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:01,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:24:02,822 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.65 vs. limit=15.0 2023-09-30 07:24:03,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:24:03,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638460.0, ans=0.1 2023-09-30 07:24:06,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:11,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 07:24:11,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:12,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:14,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:15,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:17,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:24:17,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:17,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=638526.6666666666, ans=0.0 2023-09-30 07:24:22,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 07:24:22,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 07:24:22,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 07:24:25,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:24:25,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 07:24:27,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:24:29,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:24:29,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:29,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:29,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:32,368 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 07:24:33,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:36,281 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.42 vs. limit=15.0 2023-09-30 07:24:38,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=638593.3333333334, ans=0.125 2023-09-30 07:24:41,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:45,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:24:45,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. 
Duration: 0.99 2023-09-30 07:24:50,046 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.65 vs. limit=15.0 2023-09-30 07:24:50,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:24:50,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:50,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:24:52,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:24:54,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:54,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:24:57,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:57,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 07:24:57,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=638660.0, ans=0.015 2023-09-30 07:25:01,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 07:25:01,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=638660.0, ans=0.0 2023-09-30 07:25:03,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:03,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:25:03,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:25:06,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:09,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 07:25:11,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:25:12,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:25:14,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:25:14,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638726.6666666666, ans=0.1 2023-09-30 07:25:14,786 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=638726.6666666666, ans=0.125 2023-09-30 07:25:16,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:25:16,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 07:25:16,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:25:16,210 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 07:25:19,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:20,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=638793.3333333334, ans=0.125 2023-09-30 07:25:21,391 INFO [train.py:1039] (0/4) Epoch 19, batch 200, loss[loss=0.1997, simple_loss=0.2674, pruned_loss=0.06606, over 22800.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2577, pruned_loss=0.05225, over 3007906.27 frames. ], batch size: 322, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:25:24,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:25:24,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 07:25:26,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=638793.3333333334, ans=0.125 2023-09-30 07:25:28,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 07:25:28,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:28,556 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=638793.3333333334, ans=0.125 2023-09-30 07:25:29,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:31,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=638793.3333333334, ans=0.0 2023-09-30 07:25:32,721 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.866e+02 2.060e+02 2.341e+02 3.608e+02, threshold=4.119e+02, percent-clipped=0.0 2023-09-30 07:25:33,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 07:25:34,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:25:36,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:38,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 07:25:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:25:40,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:40,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:58,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:26:00,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:26:00,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:26:00,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:26:02,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:26:02,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:26:03,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:03,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:26:05,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:26:05,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:06,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=638926.6666666666, ans=0.0 2023-09-30 07:26:07,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=638926.6666666666, ans=0.1 2023-09-30 07:26:08,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 07:26:08,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:26:08,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:13,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:26:15,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=638993.3333333334, ans=0.025 2023-09-30 07:26:19,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:19,401 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=638993.3333333334, ans=0.125 2023-09-30 07:26:25,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 07:26:26,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:26:33,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:35,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 07:26:36,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:36,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:26:36,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:38,363 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:26:39,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:26:41,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 07:26:42,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:26:42,602 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 07:26:43,975 INFO [train.py:1039] (0/4) Epoch 19, batch 250, loss[loss=0.1704, simple_loss=0.2451, pruned_loss=0.04789, over 23460.00 frames. 
], tot_loss[loss=0.1817, simple_loss=0.2569, pruned_loss=0.05323, over 3365596.16 frames. ], batch size: 134, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:26:45,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:47,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:26:50,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:50,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:54,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:26:54,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:58,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:27:00,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:02,650 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=639193.3333333334, ans=0.015 2023-09-30 07:27:12,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:13,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 07:27:15,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:27:18,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:27:20,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:27:21,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:27:21,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:23,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:27:25,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:27:27,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:30,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:27:33,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 07:27:33,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:35,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 07:27:35,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:27:35,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:27:36,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:27:37,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:27:37,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:27:40,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:41,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:27:43,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:46,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:27:51,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:54,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:57,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:27:59,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:28:03,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 07:28:05,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:05,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:28:06,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 07:28:07,421 INFO [train.py:1039] (0/4) Epoch 19, batch 300, loss[loss=0.1723, simple_loss=0.2536, pruned_loss=0.04551, over 24654.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2551, pruned_loss=0.05257, over 3662067.58 frames. ], batch size: 65, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:28:07,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:28:07,908 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=639460.0, ans=0.1 2023-09-30 07:28:09,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:28:09,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. 
Duration: 0.9 2023-09-30 07:28:10,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=639460.0, ans=0.125 2023-09-30 07:28:12,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:28:13,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:28:16,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:28:18,704 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.819e+02 2.024e+02 2.204e+02 2.893e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-30 07:28:18,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 07:28:20,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:28:21,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:28:21,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 07:28:21,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:26,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:28:32,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 07:28:32,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 07:28:35,199 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.27 vs. limit=15.0 2023-09-30 07:28:36,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=639526.6666666666, ans=0.125 2023-09-30 07:28:37,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 07:28:37,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:41,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:42,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:42,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 07:28:42,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:28:44,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:28:46,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 07:28:47,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:52,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:28:52,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 07:28:52,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:28:52,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=639593.3333333334, ans=0.2 2023-09-30 07:28:56,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:56,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 07:28:57,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:02,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:29:06,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:29:06,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-09-30 07:29:08,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=639660.0, ans=0.125 2023-09-30 07:29:11,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:11,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:29:14,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:17,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:29:17,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 07:29:17,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:29:18,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:19,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 07:29:22,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:22,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:23,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 07:29:23,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:24,532 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.53 vs. limit=10.0 2023-09-30 07:29:25,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:28,613 INFO [train.py:1039] (0/4) Epoch 19, batch 350, loss[loss=0.1775, simple_loss=0.2528, pruned_loss=0.05107, over 24668.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2532, pruned_loss=0.05162, over 3903136.89 frames. ], batch size: 65, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:29:30,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:30,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:29:33,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:40,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:41,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:43,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:46,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. 
Duration: 0.97 2023-09-30 07:29:47,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:47,598 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.52 vs. limit=15.0 2023-09-30 07:29:48,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 07:29:50,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:52,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 07:29:53,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:55,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 07:29:58,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:29:59,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:30:01,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:30:01,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:01,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:30:03,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:03,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:03,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:30:06,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:06,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:14,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:30:14,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:30:15,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:30:17,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:18,881 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-96000.pt 2023-09-30 07:30:25,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=639993.3333333334, ans=0.2 2023-09-30 07:30:26,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. 
Duration: 0.91 2023-09-30 07:30:26,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:30,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:30,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:30,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:30:33,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 07:30:35,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:36,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 07:30:36,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 07:30:36,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:38,628 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:30:39,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:40,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. 
Duration: 0.45 2023-09-30 07:30:43,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:44,229 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.58 vs. limit=15.0 2023-09-30 07:30:48,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:30:48,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:50,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:50,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:52,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:52,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=640060.0, ans=0.125 2023-09-30 07:30:55,827 INFO [train.py:1039] (0/4) Epoch 19, batch 400, loss[loss=0.1836, simple_loss=0.2539, pruned_loss=0.05664, over 23750.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2524, pruned_loss=0.05137, over 4086296.25 frames. ], batch size: 179, lr: 5.49e-03, grad_scale: 32.0 2023-09-30 07:30:56,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:59,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 07:30:59,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 07:30:59,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:00,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:03,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:31:03,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:06,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:06,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:07,645 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.862e+02 2.041e+02 2.218e+02 3.370e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-30 07:31:07,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 07:31:08,562 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.90 vs. limit=15.0 2023-09-30 07:31:10,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. 
Duration: 0.71 2023-09-30 07:31:10,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:13,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 07:31:13,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:14,538 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.94 vs. limit=15.0 2023-09-30 07:31:17,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:31:17,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:17,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 07:31:19,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:31:19,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:19,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:20,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 07:31:20,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=640193.3333333334, ans=0.0 2023-09-30 07:31:22,312 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 07:31:25,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 07:31:30,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:31,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:32,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 07:31:32,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 07:31:35,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:31:38,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:31:46,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 07:31:47,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:31:49,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. 
Duration: 0.87 2023-09-30 07:31:51,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:53,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:31:54,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 07:31:59,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:32:00,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=640326.6666666666, ans=0.125 2023-09-30 07:32:03,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:32:05,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:32:06,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:08,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 07:32:11,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:32:11,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 07:32:12,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 07:32:12,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:32:15,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 07:32:19,083 INFO [train.py:1039] (0/4) Epoch 19, batch 450, loss[loss=0.1824, simple_loss=0.2588, pruned_loss=0.05299, over 23464.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2529, pruned_loss=0.0516, over 4229264.05 frames. ], batch size: 134, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:32:19,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:32:20,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:32:20,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:32:22,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 07:32:22,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:32:22,537 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=640460.0, ans=0.0 2023-09-30 07:32:23,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:32:25,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 07:32:25,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 07:32:25,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:32:26,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:32:30,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:32:39,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:39,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:32:40,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 07:32:42,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 07:32:46,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:32:49,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:51,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:32:55,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 07:32:55,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:58,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 07:32:58,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 07:33:01,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 07:33:01,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=640593.3333333334, ans=0.125 2023-09-30 07:33:03,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:03,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=640593.3333333334, ans=0.1 2023-09-30 07:33:04,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:06,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:33:06,563 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 07:33:06,576 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. 
Duration: 0.64 2023-09-30 07:33:06,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=640660.0, ans=0.1 2023-09-30 07:33:08,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:33:10,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:33:11,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:33:14,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:33:14,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:33:14,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:33:16,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 07:33:16,747 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=640660.0, ans=0.125 2023-09-30 07:33:17,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 07:33:18,173 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=640660.0, ans=0.0 2023-09-30 07:33:19,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=640660.0, ans=0.1 2023-09-30 07:33:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:33:20,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:33:22,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 07:33:25,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:33:27,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 07:33:29,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 07:33:29,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:31,525 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.05 vs. limit=6.0 2023-09-30 07:33:32,839 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=640726.6666666666, ans=0.125 2023-09-30 07:33:34,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:33:36,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:33:37,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:33:39,167 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 07:33:40,602 INFO [train.py:1039] (0/4) Epoch 19, batch 500, loss[loss=0.204, simple_loss=0.2667, pruned_loss=0.07066, over 23748.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2535, pruned_loss=0.05191, over 4343344.55 frames. ], batch size: 164, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:33:44,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:44,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:33:46,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:46,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 07:33:49,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 07:33:49,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:33:52,499 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.803e+02 2.032e+02 2.368e+02 3.527e+02, threshold=4.065e+02, percent-clipped=0.0 2023-09-30 07:33:52,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:33:57,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:33:58,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:34:00,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:34:00,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:34:00,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:04,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.06 vs. limit=15.0 2023-09-30 07:34:12,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:12,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-09-30 07:34:12,678 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=640926.6666666666, ans=0.125 2023-09-30 07:34:14,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:34:14,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:14,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 07:34:15,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:34:17,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:34:19,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:34:20,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:34:20,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:22,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 07:34:25,904 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 07:34:30,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:34:30,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:32,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:34:35,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 07:34:38,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:34:40,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:44,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:34:46,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:53,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:55,441 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=641060.0, ans=0.0 2023-09-30 07:34:57,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. 
Duration: 0.7 2023-09-30 07:34:57,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:57,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:00,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 07:35:00,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:35:03,226 INFO [train.py:1039] (0/4) Epoch 19, batch 550, loss[loss=0.1898, simple_loss=0.2527, pruned_loss=0.06349, over 23750.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2548, pruned_loss=0.05212, over 4440932.24 frames. ], batch size: 195, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:35:03,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:07,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 07:35:08,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=641126.6666666666, ans=0.0 2023-09-30 07:35:09,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 07:35:09,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:09,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. 
Duration: 0.54 2023-09-30 07:35:11,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:35:11,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:11,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:35:14,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:35:17,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:18,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 07:35:18,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:35:21,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=641193.3333333334, ans=0.1 2023-09-30 07:35:24,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 07:35:24,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:26,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:28,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:28,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=641193.3333333334, ans=0.0 2023-09-30 07:35:30,924 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.10 vs. limit=6.0 2023-09-30 07:35:32,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=641193.3333333334, ans=0.0 2023-09-30 07:35:33,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 07:35:33,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 07:35:36,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:35:42,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:35:42,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 07:35:43,088 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.38 vs. limit=15.0 2023-09-30 07:35:43,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:35:47,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:47,175 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 07:35:48,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:50,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:35:50,986 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=641326.6666666666, ans=0.125 2023-09-30 07:35:52,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:53,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:35:53,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:35:55,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:56,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. 
Duration: 0.58 2023-09-30 07:35:57,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 07:35:59,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:59,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:59,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:35:59,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:36:05,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:36:06,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:36:08,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:36:09,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:09,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:36:12,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:36:12,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:36:14,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:36:14,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:15,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:36:15,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:36:22,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 07:36:24,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 07:36:24,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=641460.0, ans=0.025 2023-09-30 07:36:25,847 INFO [train.py:1039] (0/4) Epoch 19, batch 600, loss[loss=0.1664, simple_loss=0.2512, pruned_loss=0.04078, over 24549.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2548, pruned_loss=0.05249, over 4497499.83 frames. ], batch size: 71, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:36:26,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:36:27,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:36:27,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:36:31,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=641460.0, ans=0.125 2023-09-30 07:36:36,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:36:37,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:36:39,272 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.814e+02 2.073e+02 2.344e+02 3.797e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 07:36:39,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 07:36:41,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:36:42,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:36:45,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:46,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=641526.6666666666, ans=0.04949747468305833 2023-09-30 07:36:47,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 07:36:47,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 07:36:55,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 07:36:58,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:36:58,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:58,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:37:02,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=641593.3333333334, ans=0.1 2023-09-30 07:37:04,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:37:04,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:37:05,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:11,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:37:17,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:17,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 07:37:17,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:37:25,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 07:37:30,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:37:31,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:37:36,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 07:37:36,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:37:40,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 07:37:40,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:37:40,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:37:46,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:37:48,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:37:49,748 INFO [train.py:1039] (0/4) Epoch 19, batch 650, loss[loss=0.1888, simple_loss=0.2701, pruned_loss=0.05375, over 23972.00 frames. 
], tot_loss[loss=0.1804, simple_loss=0.2548, pruned_loss=0.05299, over 4540798.20 frames. ], batch size: 80, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:37:49,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:37:51,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:37:53,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:37:56,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 07:37:57,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:38:03,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:38:03,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:08,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:14,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 07:38:15,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=641860.0, ans=0.125 2023-09-30 07:38:16,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:38:16,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:20,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:20,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:38:23,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:23,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:24,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:38:24,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:25,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=641926.6666666666, ans=0.0 2023-09-30 07:38:26,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:38:26,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=641926.6666666666, ans=0.0 2023-09-30 07:38:27,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:38:27,975 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. 
Duration: 0.88 2023-09-30 07:38:27,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:29,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:32,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:32,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:32,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=641926.6666666666, ans=0.1 2023-09-30 07:38:34,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:34,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:38:35,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 07:38:37,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:38:37,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:38:37,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:38:37,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 07:38:39,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:38:41,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 07:38:43,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 07:38:43,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:43,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:44,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:38:44,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:46,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:51,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:53,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:54,668 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:59,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:38:59,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:39:00,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:39:07,427 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.42 vs. limit=15.0 2023-09-30 07:39:08,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:39:08,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:08,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:08,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:11,472 INFO [train.py:1039] (0/4) Epoch 19, batch 700, loss[loss=0.1649, simple_loss=0.2407, pruned_loss=0.0446, over 24480.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2539, pruned_loss=0.05231, over 4582261.73 frames. ], batch size: 63, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:39:14,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 07:39:16,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 07:39:19,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. 
Duration: 0.54 2023-09-30 07:39:20,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:22,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:39:22,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 07:39:25,590 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.802e+02 1.961e+02 2.175e+02 2.904e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 07:39:27,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:29,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:39:30,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:32,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:39:33,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:39:36,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:39,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:39:39,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 07:39:41,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 07:39:41,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=642193.3333333334, ans=0.1 2023-09-30 07:39:44,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 07:39:47,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:39:47,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:39:50,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:39:56,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:39:57,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 07:40:03,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:03,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:40:03,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=642326.6666666666, ans=0.0 2023-09-30 07:40:04,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. 
Duration: 0.81 2023-09-30 07:40:06,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=642326.6666666666, ans=0.0 2023-09-30 07:40:09,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:40:10,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:13,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:40:17,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:40:18,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 07:40:20,617 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=642393.3333333334, ans=0.125 2023-09-30 07:40:21,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 07:40:23,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 07:40:27,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:29,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:29,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:40:29,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=642393.3333333334, ans=0.1 2023-09-30 07:40:33,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:33,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 07:40:34,987 INFO [train.py:1039] (0/4) Epoch 19, batch 750, loss[loss=0.1733, simple_loss=0.2533, pruned_loss=0.04668, over 23299.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2534, pruned_loss=0.0518, over 4611681.00 frames. ], batch size: 105, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:40:38,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 07:40:38,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 07:40:39,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 07:40:41,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 07:40:41,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 07:40:41,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:40:42,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. 
Duration: 0.89 2023-09-30 07:40:44,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:44,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:40:45,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:47,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:48,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:40:49,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:49,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=642526.6666666666, ans=0.125 2023-09-30 07:40:50,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:40:50,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:40:53,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:40:55,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 07:40:55,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:55,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=642526.6666666666, ans=0.0 2023-09-30 07:40:56,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 07:40:58,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:40:58,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:00,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:00,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=642526.6666666666, ans=0.09899494936611666 2023-09-30 07:41:03,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:41:04,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 07:41:04,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:04,843 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.23 vs. 
limit=15.0 2023-09-30 07:41:07,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 07:41:07,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 07:41:09,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 07:41:09,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:41:09,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:41:11,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:41:19,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:41:19,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:19,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:41:20,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:41:24,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:41:24,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. 
Duration: 0.88 2023-09-30 07:41:24,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:41:25,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:41:27,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:41:27,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=642660.0, ans=0.0 2023-09-30 07:41:30,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:41:30,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 07:41:31,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:37,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:41:39,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:41:41,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:41:43,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:41:47,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. 
Duration: 0.75 2023-09-30 07:41:47,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:41:49,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:55,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:56,727 INFO [train.py:1039] (0/4) Epoch 19, batch 800, loss[loss=0.201, simple_loss=0.2802, pruned_loss=0.0609, over 24038.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2544, pruned_loss=0.05234, over 4621637.25 frames. ], batch size: 80, lr: 5.47e-03, grad_scale: 32.0 2023-09-30 07:41:56,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:42:03,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:03,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:04,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:42:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:42:06,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:07,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:09,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:10,439 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.847e+02 2.108e+02 2.482e+02 4.355e+02, threshold=4.217e+02, percent-clipped=1.0 2023-09-30 07:42:14,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:14,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:42:19,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 07:42:19,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:20,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:20,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:42:22,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:22,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. 
Duration: 0.85 2023-09-30 07:42:22,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:23,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 07:42:27,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:30,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:33,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:42:33,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:34,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:34,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:40,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:42:40,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:42:42,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:42:42,946 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. 
Duration: 0.946 2023-09-30 07:42:42,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 07:42:44,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:42:44,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:46,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:46,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:42:52,299 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 07:42:52,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 07:42:55,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:42:56,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:43:01,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:43:04,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:04,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. 
Duration: 0.91 2023-09-30 07:43:04,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=643060.0, ans=0.07 2023-09-30 07:43:06,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:43:07,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 07:43:09,738 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=643060.0, ans=0.125 2023-09-30 07:43:14,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:14,107 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=643060.0, ans=0.125 2023-09-30 07:43:17,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:43:19,041 INFO [train.py:1039] (0/4) Epoch 19, batch 850, loss[loss=0.1696, simple_loss=0.2514, pruned_loss=0.04394, over 24471.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2553, pruned_loss=0.05252, over 4642897.54 frames. ], batch size: 63, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:43:19,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 07:43:19,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:43:19,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 07:43:21,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 07:43:21,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:24,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:43:26,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:26,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:43:28,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:43:29,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 07:43:29,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 07:43:29,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 07:43:32,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:32,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:43:34,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:43:34,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:35,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:43:39,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:39,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=643193.3333333334, ans=0.125 2023-09-30 07:43:40,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:40,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 07:43:43,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 07:43:47,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:48,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 07:43:53,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 07:43:53,608 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=643260.0, ans=0.2 2023-09-30 07:43:56,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. 
Duration: 0.8 2023-09-30 07:43:57,824 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 07:43:57,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:43:57,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:43:59,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:44:01,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 07:44:05,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:44:05,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:07,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:44:08,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:44:10,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 07:44:11,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:44:13,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 07:44:13,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=643326.6666666666, ans=0.125 2023-09-30 07:44:17,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:44:17,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:19,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:44:19,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:19,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:23,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:24,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:44:26,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 07:44:27,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:28,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:44:35,656 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=643393.3333333334, ans=0.0 2023-09-30 07:44:36,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:44:38,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:38,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 07:44:38,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:39,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:41,346 INFO [train.py:1039] (0/4) Epoch 19, batch 900, loss[loss=0.2212, simple_loss=0.2822, pruned_loss=0.08006, over 23721.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2555, pruned_loss=0.05248, over 4664635.94 frames. ], batch size: 232, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:44:42,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. 
Duration: 0.77 2023-09-30 07:44:44,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=643460.0, ans=0.125 2023-09-30 07:44:48,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:44:50,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:52,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 07:44:53,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:44:55,222 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.916e+02 2.182e+02 2.478e+02 5.058e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 07:44:55,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 07:44:55,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:44:56,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:57,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:44:59,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:44:59,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 07:45:01,001 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=643526.6666666666, ans=0.125 2023-09-30 07:45:05,433 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=643526.6666666666, ans=0.2 2023-09-30 07:45:07,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.76 vs. limit=15.0 2023-09-30 07:45:10,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:10,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:45:11,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:45:13,284 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.73 vs. limit=22.5 2023-09-30 07:45:14,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:18,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 07:45:20,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:45:22,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=643593.3333333334, ans=0.0 2023-09-30 07:45:24,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:45:25,846 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.62 vs. limit=12.0 2023-09-30 07:45:26,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:45:26,586 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 07:45:26,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 07:45:33,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:45:33,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:45:33,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:45:41,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:41,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:45:44,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-09-30 07:45:44,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:48,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 07:45:50,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:45:50,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:52,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:45:52,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:45:56,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 07:45:56,739 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 07:45:58,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:45:59,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 07:46:01,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:02,773 INFO [train.py:1039] (0/4) Epoch 19, batch 950, loss[loss=0.1823, simple_loss=0.2658, pruned_loss=0.0494, over 24391.00 frames. 
], tot_loss[loss=0.1805, simple_loss=0.2558, pruned_loss=0.05258, over 4673865.71 frames. ], batch size: 77, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:46:04,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 07:46:10,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:13,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:13,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:15,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:46:17,592 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 07:46:22,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:23,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:23,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:25,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:46:25,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. 
Duration: 0.94 2023-09-30 07:46:26,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:46:28,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:28,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 07:46:30,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:33,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:33,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:34,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:34,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 07:46:36,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=643926.6666666666, ans=0.07 2023-09-30 07:46:37,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:46:38,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=643926.6666666666, ans=0.125 2023-09-30 07:46:39,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 07:46:43,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:46:50,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:46:50,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:53,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 07:46:57,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:46:57,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:46:58,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:46:58,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:58,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:47:02,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 07:47:03,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:47:05,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:47:05,465 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=643993.3333333334, ans=0.1 2023-09-30 07:47:06,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:06,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 07:47:06,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:06,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:47:08,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 07:47:10,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=644060.0, ans=0.125 2023-09-30 07:47:12,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:47:15,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:21,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:23,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 07:47:23,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-09-30 07:47:26,774 INFO [train.py:1039] (0/4) Epoch 19, batch 1000, loss[loss=0.17, simple_loss=0.2289, pruned_loss=0.05559, over 23411.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2549, pruned_loss=0.05209, over 4689217.28 frames. ], batch size: 285, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:47:26,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:30,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 07:47:32,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:47:36,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:47:37,076 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=644126.6666666666, ans=0.2 2023-09-30 07:47:38,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 07:47:38,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 07:47:42,700 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 2.138e+02 2.514e+02 3.202e+02 5.752e+02, threshold=5.028e+02, percent-clipped=6.0 2023-09-30 07:47:42,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:47:44,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:47:45,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:46,461 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=644193.3333333334, ans=0.0 2023-09-30 07:47:47,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 07:47:51,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 07:47:53,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 07:47:53,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:47:56,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 07:47:56,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 07:47:56,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 07:47:58,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:00,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:09,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 07:48:09,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:48:11,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:11,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:11,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 07:48:11,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:13,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:48:14,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:14,638 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 07:48:14,941 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=644326.6666666666, ans=0.1 2023-09-30 07:48:18,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 07:48:19,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 07:48:20,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. 
Duration: 0.82 2023-09-30 07:48:22,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:48:29,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:29,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:48:29,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:32,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:48:34,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 07:48:36,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:48:36,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 07:48:37,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 07:48:40,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:48:40,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:43,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 07:48:46,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:48:47,169 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=644460.0, ans=0.1 2023-09-30 07:48:48,090 INFO [train.py:1039] (0/4) Epoch 19, batch 1050, loss[loss=0.1666, simple_loss=0.2339, pruned_loss=0.04962, over 23755.00 frames. ], tot_loss[loss=0.179, simple_loss=0.254, pruned_loss=0.05197, over 4680471.12 frames. ], batch size: 232, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:48:48,290 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:50,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:48:50,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=644460.0, ans=0.2 2023-09-30 07:48:51,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:48:53,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:48:54,846 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:58,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 07:48:58,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=644460.0, ans=0.125 2023-09-30 07:49:00,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:49:01,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:49:05,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:49:06,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:49:06,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:49:07,587 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=10.42 vs. limit=15.0 2023-09-30 07:49:08,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:49:08,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 07:49:09,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:09,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. 
Duration: 0.84 2023-09-30 07:49:14,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:49:14,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 07:49:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:49:20,658 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.12 vs. limit=10.0 2023-09-30 07:49:20,943 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.49 vs. limit=15.0 2023-09-30 07:49:21,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:49:21,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:49:21,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:23,929 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.53 vs. limit=10.0 2023-09-30 07:49:24,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 07:49:24,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. 
Duration: 0.6 2023-09-30 07:49:26,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:27,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 07:49:31,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 07:49:33,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:49:37,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=644660.0, ans=0.2 2023-09-30 07:49:38,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:49:39,784 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.85 vs. limit=12.0 2023-09-30 07:49:40,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:49:40,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:49:41,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:49:45,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:49:48,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. 
Duration: 0.86 2023-09-30 07:49:50,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 07:49:50,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 07:49:51,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:51,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:49:53,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 07:49:56,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:49:58,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:58,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:49:58,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:49:59,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:04,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:06,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. 
Duration: 0.98 2023-09-30 07:50:07,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:50:07,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 07:50:08,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 07:50:08,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:50:11,433 INFO [train.py:1039] (0/4) Epoch 19, batch 1100, loss[loss=0.1761, simple_loss=0.2446, pruned_loss=0.05377, over 23734.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2532, pruned_loss=0.05135, over 4694201.47 frames. ], batch size: 232, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:50:11,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:50:14,112 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=644793.3333333334, ans=0.125 2023-09-30 07:50:15,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=644793.3333333334, ans=0.0 2023-09-30 07:50:15,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=644793.3333333334, ans=0.1 2023-09-30 07:50:17,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:50:23,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 07:50:25,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:50:25,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:26,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 07:50:26,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:50:28,193 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.773e+02 2.054e+02 2.605e+02 4.840e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 07:50:29,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:50:31,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:50:34,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:50:34,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 07:50:36,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:50:39,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:39,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 07:50:41,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:50:44,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:50:49,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:50:51,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 07:50:53,152 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 07:50:53,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:56,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:58,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:50:59,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:51:00,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 07:51:01,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:51:01,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 07:51:01,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:51:01,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:01,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 07:51:08,157 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:51:08,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 07:51:10,730 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.03 vs. limit=22.5 2023-09-30 07:51:11,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:51:18,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:51:20,235 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.38 vs. limit=15.0 2023-09-30 07:51:21,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 07:51:21,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:51:22,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 07:51:26,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:26,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:28,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 07:51:28,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:51:28,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:29,293 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.41 vs. limit=15.0 2023-09-30 07:51:30,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 07:51:30,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:51:30,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 07:51:31,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:51:33,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:51:34,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 07:51:35,360 INFO [train.py:1039] (0/4) Epoch 19, batch 1150, loss[loss=0.1811, simple_loss=0.273, pruned_loss=0.04464, over 24326.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.254, pruned_loss=0.0515, over 4705817.64 frames. ], batch size: 74, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:51:40,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:43,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:51:44,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:44,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:51:46,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 07:51:46,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:51:49,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 07:51:52,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:52,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:51:56,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. 
Duration: 0.49 2023-09-30 07:51:58,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:03,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:52:05,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:06,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 07:52:07,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:52:07,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:52:11,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 07:52:13,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:14,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:52:15,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=645260.0, ans=0.125 2023-09-30 07:52:15,798 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.55 vs. limit=12.0 2023-09-30 07:52:25,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 07:52:33,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:33,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 07:52:34,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:34,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:41,961 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 07:52:44,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:45,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=645393.3333333334, ans=0.125 2023-09-30 07:52:50,475 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 07:52:55,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:52:58,258 INFO [train.py:1039] (0/4) Epoch 19, batch 1200, loss[loss=0.1606, simple_loss=0.2426, pruned_loss=0.03924, over 24551.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2541, pruned_loss=0.05202, over 4708134.78 frames. ], batch size: 71, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:52:58,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 07:52:58,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:52:58,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:53:01,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:01,941 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=645460.0, ans=0.0 2023-09-30 07:53:07,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=645460.0, ans=0.125 2023-09-30 07:53:08,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:53:08,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:53:09,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:09,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:09,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:53:11,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 07:53:11,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=645460.0, ans=0.0 2023-09-30 07:53:13,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:53:14,392 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.937e+02 2.117e+02 2.458e+02 3.944e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-30 07:53:14,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:14,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:18,248 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 07:53:21,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 07:53:23,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:53:26,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:53:29,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:29,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:53:30,815 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. 
Duration: 0.896 2023-09-30 07:53:30,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:41,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:53:41,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:53:41,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 07:53:41,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:53:44,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 07:53:46,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=645660.0, ans=0.1 2023-09-30 07:53:48,343 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.27 vs. limit=15.0 2023-09-30 07:53:51,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 07:53:51,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:52,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:54,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 07:53:56,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:53:57,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:57,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:53:59,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:53:59,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 07:54:01,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:54:01,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:01,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:54:02,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:02,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 07:54:06,029 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=645726.6666666666, ans=0.125 2023-09-30 07:54:06,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=645726.6666666666, ans=0.0 2023-09-30 07:54:07,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:54:09,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:54:11,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=645726.6666666666, ans=0.0 2023-09-30 07:54:13,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 07:54:17,630 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 07:54:19,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:19,924 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.77 vs. limit=6.0 2023-09-30 07:54:20,555 INFO [train.py:1039] (0/4) Epoch 19, batch 1250, loss[loss=0.1825, simple_loss=0.271, pruned_loss=0.04702, over 24606.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2545, pruned_loss=0.05241, over 4709449.50 frames. ], batch size: 68, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:54:22,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 07:54:23,353 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.72 vs. limit=15.0 2023-09-30 07:54:24,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:54:24,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:27,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 07:54:31,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:54:32,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:32,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 07:54:35,503 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.01 vs. limit=22.5 2023-09-30 07:54:36,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:54:37,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:54:39,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-30 07:54:40,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:42,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:54:43,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:44,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:54:49,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:54:49,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:54:49,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:51,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:52,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:54:56,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:56,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:55:03,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-09-30 07:55:03,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:55:06,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:06,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 07:55:08,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:55:08,302 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 07:55:08,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:08,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:08,681 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=645926.6666666666, ans=0.025 2023-09-30 07:55:13,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:17,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:55:19,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. 
Duration: 0.95 2023-09-30 07:55:19,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 07:55:21,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 07:55:24,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:26,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 07:55:26,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:26,493 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=646060.0, ans=0.125 2023-09-30 07:55:31,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 07:55:31,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:55:32,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 07:55:32,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:55:32,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:55:32,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-09-30 07:55:33,777 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.71 vs. limit=10.0 2023-09-30 07:55:34,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:34,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=646060.0, ans=0.0 2023-09-30 07:55:35,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 07:55:39,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:39,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:55:40,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:55:44,141 INFO [train.py:1039] (0/4) Epoch 19, batch 1300, loss[loss=0.1682, simple_loss=0.2426, pruned_loss=0.04693, over 23524.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2552, pruned_loss=0.05288, over 4702333.97 frames. ], batch size: 134, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:55:44,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:55:46,749 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-09-30 07:55:47,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 07:55:48,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 07:55:51,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:54,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:55:54,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:55:56,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:59,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:56:00,443 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.929e+02 2.084e+02 2.401e+02 3.525e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-30 07:56:00,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 07:56:02,514 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=646193.3333333334, ans=0.0 2023-09-30 07:56:05,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:56:07,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 07:56:08,192 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.01 vs. limit=12.0 2023-09-30 07:56:08,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 07:56:12,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:56:14,997 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.59 vs. limit=15.0 2023-09-30 07:56:15,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:15,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:16,562 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-09-30 07:56:17,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:56:19,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:21,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:56:21,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. 
Duration: 0.52225 2023-09-30 07:56:22,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 07:56:29,021 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=646260.0, ans=0.125 2023-09-30 07:56:30,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:56:30,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:56:32,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 07:56:32,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:56:34,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:56:35,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:56:35,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 07:56:36,773 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.13 vs. limit=15.0 2023-09-30 07:56:37,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:37,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-09-30 07:56:38,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:39,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=646326.6666666666, ans=0.0 2023-09-30 07:56:44,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:44,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:56:49,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 07:56:49,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 07:56:51,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 07:56:55,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:56:57,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 07:56:59,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:06,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 07:57:07,835 INFO [train.py:1039] (0/4) Epoch 19, batch 1350, loss[loss=0.1813, simple_loss=0.2304, pruned_loss=0.06606, over 19313.00 frames. 
], tot_loss[loss=0.1792, simple_loss=0.2533, pruned_loss=0.0526, over 4690810.19 frames. ], batch size: 388, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:57:10,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:13,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:14,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:16,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:19,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:57:19,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:21,779 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=646460.0, ans=0.0 2023-09-30 07:57:26,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:27,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 07:57:29,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:57:29,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 07:57:31,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 07:57:32,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:57:33,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:57:33,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 07:57:34,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 07:57:36,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 07:57:38,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:38,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 07:57:54,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:04,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:05,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:05,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. 
Duration: 0.85 2023-09-30 07:58:09,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:09,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 07:58:09,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:58:11,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:58:14,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:58:16,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 07:58:17,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:58:18,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=646726.6666666666, ans=0.07 2023-09-30 07:58:24,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 07:58:26,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 07:58:31,426 INFO [train.py:1039] (0/4) Epoch 19, batch 1400, loss[loss=0.1672, simple_loss=0.2486, pruned_loss=0.04294, over 24444.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2523, pruned_loss=0.05248, over 4695850.51 frames. 
], batch size: 63, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:58:33,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 07:58:34,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:36,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:58:36,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=646793.3333333334, ans=0.0 2023-09-30 07:58:38,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:58:43,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=646793.3333333334, ans=0.0 2023-09-30 07:58:44,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 07:58:45,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 07:58:49,380 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.898e+02 2.160e+02 2.671e+02 3.929e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-30 07:58:55,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:58:57,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 07:59:00,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:59:00,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:59:04,084 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.07 vs. limit=15.0 2023-09-30 07:59:04,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:59:06,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:59:07,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=646926.6666666666, ans=0.0 2023-09-30 07:59:16,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:16,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:18,707 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.22 vs. limit=15.0 2023-09-30 07:59:22,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 07:59:24,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:59:24,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 07:59:26,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:59:26,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:27,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:59:27,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:59:29,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:59:30,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 07:59:32,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:59:35,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:39,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:59:47,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 07:59:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:59:49,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 07:59:51,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:59:52,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:59:52,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:59:54,273 INFO [train.py:1039] (0/4) Epoch 19, batch 1450, loss[loss=0.1574, simple_loss=0.2315, pruned_loss=0.04166, over 24285.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2521, pruned_loss=0.05191, over 4708854.02 frames. ], batch size: 56, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:59:55,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:59:59,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:59:59,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:59,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 08:00:04,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:04,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:00:06,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:00:07,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 08:00:07,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:00:09,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 08:00:11,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:11,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:11,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 08:00:14,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:14,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:00:14,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 08:00:14,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:14,680 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=647193.3333333334, ans=0.0 2023-09-30 08:00:16,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 08:00:18,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:21,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:24,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:00:24,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:00:24,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=647193.3333333334, ans=0.0 2023-09-30 08:00:27,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:27,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:30,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:30,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:00:30,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:32,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:35,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. 
Duration: 0.73 2023-09-30 08:00:37,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:40,642 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 08:00:42,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:00:43,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:00:45,365 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:45,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 08:00:46,254 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.93 vs. limit=15.0 2023-09-30 08:00:49,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:50,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 08:00:52,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 08:00:55,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:58,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 08:00:58,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:00,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 08:01:01,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=647393.3333333334, ans=0.125 2023-09-30 08:01:03,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 08:01:03,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 08:01:05,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:06,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:01:16,689 INFO [train.py:1039] (0/4) Epoch 19, batch 1500, loss[loss=0.1644, simple_loss=0.2544, pruned_loss=0.03718, over 24357.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2531, pruned_loss=0.05182, over 4723672.14 frames. ], batch size: 74, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 08:01:16,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 08:01:16,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:01:16,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 08:01:17,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=647460.0, ans=0.125 2023-09-30 08:01:18,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:18,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:19,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:01:22,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 08:01:25,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:01:25,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:01:25,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:25,955 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=647460.0, ans=0.1 2023-09-30 08:01:27,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:01:27,856 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=647460.0, ans=0.125 2023-09-30 08:01:28,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:01:31,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:33,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=647526.6666666666, ans=0.025 2023-09-30 08:01:34,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=647526.6666666666, ans=6.0 2023-09-30 08:01:34,837 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.890e+02 2.087e+02 2.413e+02 4.629e+02, threshold=4.174e+02, percent-clipped=1.0 2023-09-30 08:01:36,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:36,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 08:01:38,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:01:38,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:01:40,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:42,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 08:01:47,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. 
Duration: 0.76 2023-09-30 08:01:50,097 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:50,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 08:01:51,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:01:52,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=647593.3333333334, ans=0.0 2023-09-30 08:01:53,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:01:54,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:54,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:56,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 08:01:58,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:01:58,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:01:59,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 08:02:00,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:02:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:02:07,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 08:02:12,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:02:12,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:02:17,784 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 08:02:19,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:19,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 08:02:19,355 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=647660.0, ans=0.125 2023-09-30 08:02:20,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:22,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:02:23,533 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 08:02:25,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 08:02:26,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 08:02:28,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:31,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:31,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:34,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:34,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:35,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:02:37,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 08:02:38,601 INFO [train.py:1039] (0/4) Epoch 19, batch 1550, loss[loss=0.2471, simple_loss=0.3044, pruned_loss=0.09492, over 19468.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2541, pruned_loss=0.05249, over 4702093.78 frames. ], batch size: 388, lr: 5.45e-03, grad_scale: 8.0 2023-09-30 08:02:38,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 08:02:38,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 08:02:40,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 08:02:40,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 08:02:43,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:43,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=647793.3333333334, ans=0.125 2023-09-30 08:02:44,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:46,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:02:46,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:02:47,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:47,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:51,429 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 08:02:51,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:51,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 08:02:53,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:02:55,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:02:55,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 08:02:55,868 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.43 vs. limit=15.0 2023-09-30 08:02:56,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:58,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 08:02:58,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 08:02:58,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 08:02:59,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:01,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:02,561 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.23 vs. 
limit=15.0 2023-09-30 08:03:03,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=647860.0, ans=0.05 2023-09-30 08:03:04,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:03:08,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 08:03:08,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 08:03:16,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:16,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=647926.6666666666, ans=0.1 2023-09-30 08:03:22,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:03:22,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:03:22,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:03:22,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 08:03:30,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:03:33,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:03:34,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:03:36,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:03:37,005 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.76 vs. limit=15.0 2023-09-30 08:03:37,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:37,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 08:03:37,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:03:40,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:03:40,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:42,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:03:42,546 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 08:03:46,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:03:51,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. 
Duration: 0.94 2023-09-30 08:03:57,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:03:59,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:59,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 08:04:01,500 INFO [train.py:1039] (0/4) Epoch 19, batch 1600, loss[loss=0.1643, simple_loss=0.2442, pruned_loss=0.04221, over 24522.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.255, pruned_loss=0.05238, over 4715974.85 frames. ], batch size: 63, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:04:03,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:04:04,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:04,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:04:04,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:04:04,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:04:08,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:09,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. 
Duration: 0.67 2023-09-30 08:04:11,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 08:04:13,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 08:04:16,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:17,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 08:04:19,704 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.864e+02 2.063e+02 2.300e+02 3.333e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-30 08:04:19,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:04:21,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:04:25,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=648193.3333333334, ans=0.125 2023-09-30 08:04:25,490 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=648193.3333333334, ans=0.125 2023-09-30 08:04:26,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:04:30,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. 
Duration: 0.37 2023-09-30 08:04:33,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:04:34,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 08:04:34,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:36,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 08:04:39,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 08:04:47,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:49,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 08:04:49,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:51,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:51,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:04:52,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:04:56,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-30 08:04:56,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:57,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:59,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:00,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:05:01,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:05:04,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:05:04,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:05:11,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:13,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:05:16,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 08:05:16,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:05:16,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. 
Duration: 0.72 2023-09-30 08:05:18,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=648393.3333333334, ans=0.125 2023-09-30 08:05:23,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:23,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:05:24,586 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.02 vs. limit=15.0 2023-09-30 08:05:25,221 INFO [train.py:1039] (0/4) Epoch 19, batch 1650, loss[loss=0.24, simple_loss=0.3014, pruned_loss=0.08928, over 19549.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2557, pruned_loss=0.05259, over 4722071.76 frames. ], batch size: 388, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:05:25,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:05:25,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 08:05:25,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 08:05:25,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 08:05:25,708 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=648460.0, ans=0.125 2023-09-30 08:05:26,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. 
Duration: 0.59 2023-09-30 08:05:30,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:31,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:31,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:05:31,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:05:33,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:36,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 08:05:39,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:05:39,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:39,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:05:39,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:05:42,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 08:05:42,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. 
Duration: 0.81 2023-09-30 08:05:48,755 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:05:50,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:05:58,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 08:06:01,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:03,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 08:06:08,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:09,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:06:11,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:06:11,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:12,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:06:13,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:16,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:06:16,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:18,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:18,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:20,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:20,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:06:20,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=648660.0, ans=0.1 2023-09-30 08:06:20,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=648660.0, ans=0.125 2023-09-30 08:06:21,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:22,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=648660.0, ans=0.125 2023-09-30 08:06:23,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 08:06:24,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:25,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. 
Duration: 0.76 2023-09-30 08:06:25,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 08:06:26,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 08:06:26,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:27,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:06:29,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:29,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:29,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 08:06:34,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:37,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:06:37,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:40,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 08:06:47,396 INFO [train.py:1039] (0/4) Epoch 19, batch 1700, loss[loss=0.1796, simple_loss=0.2631, pruned_loss=0.04802, over 24567.00 frames. 
], tot_loss[loss=0.1798, simple_loss=0.2545, pruned_loss=0.05249, over 4714157.59 frames. ], batch size: 71, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:06:47,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:47,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:06:47,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 08:06:49,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:06:49,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:06:49,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:49,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=648793.3333333334, ans=0.125 2023-09-30 08:06:52,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:06:52,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:06:52,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 08:06:56,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 08:07:04,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:07:05,478 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.834e+02 2.122e+02 2.379e+02 4.054e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 08:07:05,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:07:11,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:07:12,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:12,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:07:14,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:15,503 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.61 vs. limit=15.0 2023-09-30 08:07:17,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 08:07:19,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:07:19,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:07:22,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:07:24,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:07:24,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=648926.6666666666, ans=0.125 2023-09-30 08:07:26,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 08:07:26,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 08:07:27,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:27,945 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=648926.6666666666, ans=0.1 2023-09-30 08:07:28,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=648926.6666666666, ans=0.0 2023-09-30 08:07:29,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 08:07:29,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:07:40,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:40,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:07:41,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:44,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:07:44,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 08:07:44,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:47,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:47,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 08:07:48,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:07:48,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:48,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:48,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:07:51,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:51,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 08:07:54,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:54,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:07:54,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:59,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:00,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 08:08:00,871 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.75 vs. limit=10.0 2023-09-30 08:08:02,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:03,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:06,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 08:08:09,788 INFO [train.py:1039] (0/4) Epoch 19, batch 1750, loss[loss=0.1691, simple_loss=0.2525, pruned_loss=0.04281, over 23425.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2539, pruned_loss=0.05177, over 4717553.37 frames. 
], batch size: 93, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:08:10,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=649126.6666666666, ans=0.1 2023-09-30 08:08:11,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:14,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:14,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:08:14,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 08:08:16,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:08:19,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:08:19,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:23,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 08:08:26,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:29,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. 
Duration: 0.64 2023-09-30 08:08:29,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:08:31,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:08:34,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:08:35,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=649193.3333333334, ans=0.1 2023-09-30 08:08:36,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 08:08:39,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:08:39,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 08:08:50,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:08:52,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:08:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:55,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:55,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:08:58,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:00,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:02,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:02,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:09:04,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 08:09:06,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:09,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 08:09:11,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:11,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:12,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:09:14,745 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=649393.3333333334, ans=0.1 2023-09-30 08:09:17,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 08:09:18,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 08:09:18,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:20,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=649393.3333333334, ans=0.1 2023-09-30 08:09:22,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:25,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:27,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:29,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:09:30,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 08:09:30,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:31,821 INFO [train.py:1039] (0/4) Epoch 19, batch 1800, loss[loss=0.1724, simple_loss=0.2464, pruned_loss=0.04923, over 23555.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2538, pruned_loss=0.05166, over 4711504.38 frames. ], batch size: 134, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:09:32,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. 
Duration: 0.536375 2023-09-30 08:09:32,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:32,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:09:32,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:09:32,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:09:37,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:09:39,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:40,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:09:43,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:47,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:09:47,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:09:47,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=649526.6666666666, ans=0.1 2023-09-30 08:09:49,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=649526.6666666666, ans=0.2 2023-09-30 08:09:50,074 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.833e+02 2.073e+02 2.384e+02 3.418e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:09:50,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:09:53,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:53,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:53,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=649526.6666666666, ans=0.1 2023-09-30 08:09:54,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:09:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:58,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 08:09:58,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:10:03,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:06,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 08:10:09,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 08:10:09,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 08:10:10,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:11,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:10:11,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:11,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:10:16,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=649593.3333333334, ans=0.0 2023-09-30 08:10:20,275 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 08:10:21,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:10:23,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:10:24,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 08:10:24,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 08:10:26,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:10:26,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:10:28,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:10:33,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 08:10:39,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:10:39,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=649726.6666666666, ans=0.05 2023-09-30 08:10:41,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 08:10:41,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:10:42,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:42,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 08:10:44,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 08:10:47,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:10:47,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:10:48,312 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=649726.6666666666, ans=0.2 2023-09-30 08:10:51,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 08:10:51,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:53,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:10:54,645 INFO [train.py:1039] (0/4) Epoch 19, batch 1850, loss[loss=0.1727, simple_loss=0.2597, pruned_loss=0.0429, over 24434.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2536, pruned_loss=0.05145, over 4720619.84 frames. ], batch size: 69, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:10:54,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:10:54,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:56,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:10:56,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:10:59,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:59,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:01,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:11:01,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:01,793 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=649793.3333333334, ans=0.125 2023-09-30 08:11:11,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:11:11,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 08:11:11,528 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=649860.0, ans=0.0 2023-09-30 08:11:15,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 08:11:17,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 08:11:22,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:11:22,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 08:11:22,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 08:11:32,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:11:34,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 08:11:38,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:11:38,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:11:44,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 08:11:45,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:45,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:11:47,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:11:49,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:50,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:11:54,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:11:54,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:56,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:11:56,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:57,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:11:59,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:12:02,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 08:12:02,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:12:07,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:12:07,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:12:07,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 08:12:07,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. 
Duration: 0.57 2023-09-30 08:12:09,563 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 08:12:11,070 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 08:12:12,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:12:12,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:12:12,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:12,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:12,808 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 08:12:12,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:12:12,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:14,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:12:16,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:12:16,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:12:16,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 08:12:18,008 INFO [train.py:1039] (0/4) Epoch 19, batch 1900, loss[loss=0.1987, simple_loss=0.2832, pruned_loss=0.05715, over 24057.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2543, pruned_loss=0.05196, over 4707398.96 frames. ], batch size: 80, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:12:19,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:19,660 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 08:12:19,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:12:21,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.73 vs. limit=15.0 2023-09-30 08:12:22,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:29,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:32,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:12:34,261 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-09-30 08:12:35,688 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.831e+02 2.039e+02 2.265e+02 4.223e+02, threshold=4.078e+02, percent-clipped=1.0 2023-09-30 08:12:35,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 08:12:36,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:37,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:12:37,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 08:12:37,610 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 08:12:41,429 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=650193.3333333334, ans=0.125 2023-09-30 08:12:42,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 08:12:44,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:12:48,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 08:12:52,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 08:12:59,465 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.30 vs. 
limit=15.0 2023-09-30 08:13:00,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 08:13:05,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 08:13:05,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:05,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 08:13:05,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 08:13:05,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 08:13:07,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 08:13:07,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:13:09,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=650326.6666666666, ans=0.125 2023-09-30 08:13:09,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=650326.6666666666, ans=0.0 2023-09-30 08:13:10,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 08:13:14,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 08:13:15,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:15,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 08:13:20,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:13:22,587 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=650393.3333333334, ans=0.125 2023-09-30 08:13:23,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 08:13:23,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:30,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:13:30,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:13:30,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:13:32,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:13:33,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:13:33,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. 
Duration: 0.42725 2023-09-30 08:13:34,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:13:38,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:38,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:13:40,087 INFO [train.py:1039] (0/4) Epoch 19, batch 1950, loss[loss=0.1955, simple_loss=0.2636, pruned_loss=0.0637, over 23704.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2554, pruned_loss=0.05291, over 4694773.89 frames. ], batch size: 232, lr: 5.44e-03, grad_scale: 8.0 2023-09-30 08:13:41,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:13:41,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:41,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:43,354 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.20 vs. limit=15.0 2023-09-30 08:13:43,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:48,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:13:49,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 08:13:49,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:49,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:13:53,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 08:13:55,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:13:55,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:55,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:56,858 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=650526.6666666666, ans=0.125 2023-09-30 08:13:58,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:13:58,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:13:58,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:02,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:03,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 08:14:03,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:14:03,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:14:03,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:05,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=650526.6666666666, ans=0.2 2023-09-30 08:14:06,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:11,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:14:11,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:12,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:14:12,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 08:14:12,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:14:13,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:14:13,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:14:18,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:20,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:14:25,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:14:28,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:14:28,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:14:30,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 08:14:30,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:14:32,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=650660.0, ans=0.0 2023-09-30 08:14:33,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:35,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:14:35,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:14:41,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:14:43,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:45,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:49,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:52,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:14:53,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:54,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 08:14:54,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:14:55,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:57,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 08:14:57,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:15:03,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:15:04,375 INFO [train.py:1039] (0/4) Epoch 19, batch 2000, loss[loss=0.2452, simple_loss=0.3008, pruned_loss=0.09477, over 19358.00 frames. 
], tot_loss[loss=0.1821, simple_loss=0.2564, pruned_loss=0.05385, over 4674292.91 frames. ], batch size: 388, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:15:04,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:15:04,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:05,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:15:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:10,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 08:15:12,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:15:16,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:15:18,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 08:15:18,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:15:18,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 08:15:21,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=650860.0, ans=0.05 2023-09-30 08:15:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:15:23,986 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.096e+02 2.439e+02 2.971e+02 4.515e+02, threshold=4.878e+02, percent-clipped=2.0 2023-09-30 08:15:24,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 08:15:25,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:27,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:28,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:30,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 08:15:30,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:15:32,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 08:15:32,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:36,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:15:39,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:15:39,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:39,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:42,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:15:42,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 08:15:42,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=650926.6666666666, ans=0.1 2023-09-30 08:15:45,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 08:15:45,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:45,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:15:51,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:51,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:15:51,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 08:15:52,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:56,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:57,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:58,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:58,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:00,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:04,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:16:04,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 08:16:05,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=650993.3333333334, ans=0.1 2023-09-30 08:16:09,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:16:09,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 08:16:13,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:13,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:16:18,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:21,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:16:21,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:16:23,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:23,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=651060.0, ans=0.0 2023-09-30 08:16:25,900 INFO [train.py:1039] (0/4) Epoch 19, batch 2050, loss[loss=0.163, simple_loss=0.243, pruned_loss=0.04149, over 24682.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2554, pruned_loss=0.0533, over 4696731.90 frames. ], batch size: 65, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:16:25,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:16:30,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:30,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:35,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:16:38,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:16:40,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:40,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:16:43,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 08:16:43,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:16:44,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:44,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:16:53,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:16:53,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 08:16:56,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 08:16:59,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:17:01,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 08:17:01,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:17:06,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:06,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:08,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:17:08,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:10,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:17:10,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=651260.0, ans=0.125 2023-09-30 08:17:12,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:17:12,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 08:17:15,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:17,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:17:20,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:17:22,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:17:26,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:31,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:17:33,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 08:17:35,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=651393.3333333334, ans=10.0 2023-09-30 08:17:40,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:40,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:17:43,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:17:45,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. 
Duration: 0.91 2023-09-30 08:17:47,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=651393.3333333334, ans=0.125 2023-09-30 08:17:50,115 INFO [train.py:1039] (0/4) Epoch 19, batch 2100, loss[loss=0.1669, simple_loss=0.2456, pruned_loss=0.04408, over 24462.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2536, pruned_loss=0.05252, over 4711687.67 frames. ], batch size: 63, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:17:50,304 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 08:17:50,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:17:50,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:51,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:17:53,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:53,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 08:17:53,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 08:17:53,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=651460.0, ans=0.2 2023-09-30 08:17:54,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 08:17:58,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:17:58,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:18:00,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:00,411 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=651460.0, ans=0.125 2023-09-30 08:18:01,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:18:01,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 08:18:03,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:18:04,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 08:18:04,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 08:18:05,005 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=651526.6666666666, ans=0.125 2023-09-30 08:18:05,043 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=651526.6666666666, ans=0.1 2023-09-30 08:18:06,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:18:07,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:07,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 08:18:07,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 08:18:09,217 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.858e+02 2.144e+02 2.529e+02 4.189e+02, threshold=4.288e+02, percent-clipped=0.0 2023-09-30 08:18:13,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 08:18:13,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:18:14,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=651526.6666666666, ans=0.125 2023-09-30 08:18:16,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:18:16,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:18:19,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:18:21,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. 
Duration: 0.69 2023-09-30 08:18:21,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:21,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 08:18:24,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 08:18:26,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:26,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 08:18:26,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 08:18:26,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=651593.3333333334, ans=0.125 2023-09-30 08:18:28,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 08:18:29,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:18:32,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:18:36,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:36,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 08:18:37,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:39,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:39,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 08:18:39,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:40,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:40,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:40,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 08:18:42,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 08:18:42,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=651660.0, ans=0.125 2023-09-30 08:18:43,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 08:18:46,385 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:18:47,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 08:18:50,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=651660.0, ans=0.125 2023-09-30 08:18:52,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:52,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 08:18:58,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:01,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:19:03,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:03,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:03,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:19:03,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:04,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:04,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:19:05,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 08:19:05,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:08,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 08:19:09,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 08:19:09,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:12,615 INFO [train.py:1039] (0/4) Epoch 19, batch 2150, loss[loss=0.1668, simple_loss=0.237, pruned_loss=0.04827, over 23645.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2525, pruned_loss=0.05169, over 4714962.74 frames. ], batch size: 232, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:19:12,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:19:12,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:19:12,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:19:12,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:19:19,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:19:22,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:19:22,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:24,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:19:24,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:24,686 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.74 vs. limit=15.0 2023-09-30 08:19:25,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:19:27,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:27,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:19:27,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:19:32,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:32,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 08:19:39,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:19:41,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:19:41,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:19:44,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:44,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:44,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:45,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 08:19:47,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:19:48,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:48,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:19:50,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:52,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:19:53,063 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.08 vs. limit=15.0 2023-09-30 08:19:55,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:55,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:19:55,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=651926.6666666666, ans=0.1 2023-09-30 08:19:57,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:57,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 08:19:57,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:20:02,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:02,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:20:03,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:04,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:20:05,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:07,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:07,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 08:20:08,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 08:20:09,340 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.40 vs. limit=15.0 2023-09-30 08:20:10,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:20:10,793 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 08:20:10,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:11,077 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=651993.3333333334, ans=0.1 2023-09-30 08:20:12,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 08:20:12,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 08:20:12,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:20:12,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 08:20:13,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 08:20:13,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 08:20:13,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 08:20:16,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:16,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:20:16,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:20:17,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=652060.0, ans=0.1 2023-09-30 08:20:18,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:20,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-30 08:20:21,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:21,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:29,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:20:31,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 08:20:34,262 INFO [train.py:1039] (0/4) Epoch 19, batch 2200, loss[loss=0.1736, simple_loss=0.2545, pruned_loss=0.04631, over 24484.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2528, pruned_loss=0.05159, over 4722074.51 frames. ], batch size: 63, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:20:35,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:20:42,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:43,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:20:44,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:20:44,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:20:46,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:20:46,996 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=652126.6666666666, ans=0.0 2023-09-30 08:20:48,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:20:48,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 08:20:53,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 08:20:53,338 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=652193.3333333334, ans=0.0 2023-09-30 08:20:54,315 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.914e+02 2.227e+02 2.792e+02 4.256e+02, threshold=4.455e+02, percent-clipped=0.0 2023-09-30 08:20:55,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:21:01,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 08:21:02,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:03,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:04,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 08:21:06,479 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=652260.0, ans=0.125 2023-09-30 08:21:09,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:21:09,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 08:21:12,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:21:14,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:14,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 08:21:19,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:21:19,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:21,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=652260.0, ans=0.0 2023-09-30 08:21:22,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:21:22,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:26,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-09-30 08:21:26,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=652326.6666666666, ans=0.1 2023-09-30 08:21:27,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:27,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=652326.6666666666, ans=0.125 2023-09-30 08:21:28,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 08:21:32,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:32,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:21:32,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:34,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:34,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:35,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:35,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:21:37,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:21:37,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:21:40,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:21:42,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:21:42,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:21:45,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:21:47,408 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 08:21:49,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:21:49,086 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 08:21:51,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:21:52,939 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 08:21:54,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:21:54,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:21:56,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:57,560 INFO [train.py:1039] (0/4) Epoch 19, batch 2250, loss[loss=0.18, simple_loss=0.2528, pruned_loss=0.05359, over 23791.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2533, pruned_loss=0.05167, over 4716110.09 frames. ], batch size: 212, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:21:59,375 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 08:22:01,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:22:02,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:06,070 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=652460.0, ans=0.0 2023-09-30 08:22:07,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:22:09,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:22:11,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=652460.0, ans=0.125 2023-09-30 08:22:12,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:22:12,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:14,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:15,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 08:22:16,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:16,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:22:19,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 08:22:19,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=652526.6666666666, ans=0.0 2023-09-30 08:22:21,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:22:21,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:22,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:28,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:22:28,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=652526.6666666666, ans=0.125 2023-09-30 08:22:30,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:22:31,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:22:31,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 08:22:33,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:34,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:22:38,614 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=652593.3333333334, ans=0.0 2023-09-30 08:22:39,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:41,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:43,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:22:43,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:22:47,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:50,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:22:55,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:22:58,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:23:06,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:23:06,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:23:06,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:23:11,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:23:16,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:23:16,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 08:23:16,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:16,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 08:23:19,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 08:23:20,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=652793.3333333334, ans=6.0 2023-09-30 08:23:21,046 INFO [train.py:1039] (0/4) Epoch 19, batch 2300, loss[loss=0.1956, simple_loss=0.2758, pruned_loss=0.05769, over 24442.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2543, pruned_loss=0.05192, over 4716042.89 frames. ], batch size: 77, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:23:22,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:23:22,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:28,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:29,027 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:23:32,512 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 08:23:34,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:40,222 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.880e+02 2.084e+02 2.491e+02 4.260e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 08:23:42,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:23:42,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:23:42,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:23:42,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:42,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 08:23:44,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:23:47,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:23:47,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:23:51,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:23:54,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=652926.6666666666, ans=0.125 2023-09-30 08:23:55,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:23:58,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:23:59,447 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=652926.6666666666, ans=0.0 2023-09-30 08:24:03,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:24:03,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:24:07,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:24:07,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:24:10,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:24:11,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:24:11,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:24:11,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 08:24:17,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=652993.3333333334, ans=0.0 2023-09-30 08:24:18,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:24:18,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 08:24:18,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:18,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:24:18,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:20,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:24:20,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:24:20,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 08:24:21,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:24:21,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:21,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 08:24:29,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:24:32,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:24:36,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:24:36,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:24:38,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:24:38,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:24:39,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:24:39,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:24:41,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 08:24:42,931 INFO [train.py:1039] (0/4) Epoch 19, batch 2350, loss[loss=0.1598, simple_loss=0.2354, pruned_loss=0.0421, over 24424.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2556, pruned_loss=0.05266, over 4714912.90 frames. ], batch size: 58, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:24:46,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:24:46,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 08:24:48,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=653126.6666666666, ans=0.0 2023-09-30 08:24:54,474 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. 
limit=6.0 2023-09-30 08:24:55,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 08:24:57,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:25:00,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:00,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:01,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:01,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:03,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 08:25:07,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:25:13,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 08:25:14,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:16,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:25:16,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 08:25:19,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:25:21,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 08:25:21,601 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=653260.0, ans=0.125 2023-09-30 08:25:22,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:25:22,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:22,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:23,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:25:27,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.31 vs. limit=10.0 2023-09-30 08:25:28,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:25:30,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 08:25:32,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 08:25:35,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:35,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:25:38,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 08:25:38,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:25:42,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 08:25:43,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:25:45,189 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=653326.6666666666, ans=0.125 2023-09-30 08:25:48,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 08:25:54,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 08:25:55,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:55,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 08:25:55,584 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. 
Duration: 0.924 2023-09-30 08:25:55,612 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 08:25:57,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 08:25:59,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:26:04,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:26:06,055 INFO [train.py:1039] (0/4) Epoch 19, batch 2400, loss[loss=0.1942, simple_loss=0.258, pruned_loss=0.06521, over 23752.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2552, pruned_loss=0.05279, over 4702531.93 frames. ], batch size: 164, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:26:10,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:26:13,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:26:13,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 08:26:13,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 08:26:19,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:26:19,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:26:22,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 08:26:22,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:26:22,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:24,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 08:26:25,937 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.937e+02 2.110e+02 2.328e+02 3.835e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-30 08:26:29,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:30,421 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.72 vs. limit=22.5 2023-09-30 08:26:32,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 08:26:35,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=653526.6666666666, ans=0.2 2023-09-30 08:26:37,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:26:43,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 08:26:44,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:26:47,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:53,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:26:54,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 08:26:54,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:26:59,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=653660.0, ans=0.2 2023-09-30 08:27:02,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:04,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:07,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:08,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:27:08,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:27:08,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:27:08,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:27:11,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:11,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:27:17,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:27:18,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:27:18,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 08:27:20,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 08:27:22,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:27:23,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:23,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 08:27:23,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 08:27:23,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 08:27:23,209 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. 
Duration: 0.966 2023-09-30 08:27:25,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 08:27:26,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:27:29,760 INFO [train.py:1039] (0/4) Epoch 19, batch 2450, loss[loss=0.1864, simple_loss=0.2694, pruned_loss=0.05174, over 23991.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2538, pruned_loss=0.0521, over 4696481.30 frames. ], batch size: 80, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:27:29,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:29,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:31,402 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 08:27:31,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:32,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:27:35,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:27:35,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:39,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:27:39,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:40,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 08:27:45,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:45,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:51,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:27:51,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:27:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:27:52,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 08:27:57,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:58,469 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.81 vs. limit=12.0 2023-09-30 08:27:59,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:27:59,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:28:04,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:28:04,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:06,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:07,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:28:09,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 08:28:10,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:28:11,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=653926.6666666666, ans=0.125 2023-09-30 08:28:18,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:20,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:28:20,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:20,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:28:20,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:28:24,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:28:24,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 08:28:28,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:28,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:28:31,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:28:31,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:37,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:28:37,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 08:28:39,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:28:39,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:28:39,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 08:28:41,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:28:41,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:28:41,465 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=654060.0, ans=0.0 2023-09-30 08:28:47,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:28:50,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:50,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:28:51,552 INFO [train.py:1039] (0/4) Epoch 19, batch 2500, loss[loss=0.171, simple_loss=0.2523, pruned_loss=0.04491, over 24548.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2523, pruned_loss=0.05151, over 4698036.80 frames. ], batch size: 71, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:28:53,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 08:28:55,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:28:57,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=654126.6666666666, ans=0.125 2023-09-30 08:29:00,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=654126.6666666666, ans=10.0 2023-09-30 08:29:01,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:29:08,499 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.88 vs. limit=15.0 2023-09-30 08:29:09,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=654193.3333333334, ans=0.04949747468305833 2023-09-30 08:29:11,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:29:11,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:29:12,730 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.785e+02 1.970e+02 2.171e+02 3.825e+02, threshold=3.939e+02, percent-clipped=0.0 2023-09-30 08:29:12,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:12,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 08:29:20,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:29:22,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:29:22,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:29:22,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-30 08:29:22,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 08:29:25,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:25,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:25,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 08:29:25,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:26,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 08:29:26,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:32,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:29:32,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:36,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:29:36,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 08:29:36,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:29:38,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:41,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:47,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:52,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:29:56,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:29:59,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 08:29:59,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:00,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:04,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:30:04,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:30:04,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 08:30:04,602 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. 
Duration: 0.768 2023-09-30 08:30:04,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 08:30:09,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:11,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 08:30:11,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 08:30:11,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:30:12,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 08:30:15,571 INFO [train.py:1039] (0/4) Epoch 19, batch 2550, loss[loss=0.1976, simple_loss=0.2643, pruned_loss=0.06541, over 23761.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2524, pruned_loss=0.05158, over 4694151.30 frames. ], batch size: 179, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:30:15,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 08:30:17,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:20,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:30:20,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 08:30:23,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:25,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 08:30:25,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:30:30,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 08:30:31,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:30:33,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:35,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:36,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 08:30:36,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:30:37,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:37,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:39,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 08:30:41,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 08:30:41,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:41,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:41,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 08:30:51,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:30:58,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:30:58,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:58,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:59,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:30:59,829 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.60 vs. limit=15.0 2023-09-30 08:31:05,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:31:05,809 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=654660.0, ans=0.125 2023-09-30 08:31:08,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:31:08,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:31:10,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:31:10,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:31:10,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:31:14,671 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.66 vs. limit=15.0 2023-09-30 08:31:15,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:15,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:23,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:31:23,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. 
Duration: 0.93 2023-09-30 08:31:23,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:31:23,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:25,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:31:25,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:31:26,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:34,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:31:35,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:36,024 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=654793.3333333334, ans=0.0 2023-09-30 08:31:37,206 INFO [train.py:1039] (0/4) Epoch 19, batch 2600, loss[loss=0.1664, simple_loss=0.2355, pruned_loss=0.04862, over 24384.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2538, pruned_loss=0.05218, over 4698345.83 frames. ], batch size: 56, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:31:37,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=654793.3333333334, ans=0.2 2023-09-30 08:31:40,309 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. 
Duration: 0.896 2023-09-30 08:31:43,932 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 08:31:43,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:31:44,022 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 08:31:44,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 08:31:45,533 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 08:31:47,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=654793.3333333334, ans=0.125 2023-09-30 08:31:48,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:48,626 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 08:31:50,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 08:31:50,843 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 08:31:54,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:31:54,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 08:31:56,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. 
Duration: 0.77 2023-09-30 08:31:57,364 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.901e+02 2.112e+02 2.544e+02 3.828e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-30 08:31:57,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:31:59,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 08:32:02,628 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 08:32:03,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 08:32:10,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:11,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:11,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:11,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 08:32:13,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=654926.6666666666, ans=0.05 2023-09-30 08:32:14,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:32:19,957 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. 
Duration: 0.768 2023-09-30 08:32:23,310 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:32:26,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:26,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:28,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 08:32:28,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:28,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:29,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 08:32:31,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:32:33,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:32:34,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:37,935 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 08:32:39,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:32:39,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:32:44,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:45,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:32:45,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 08:32:45,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:47,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:32:48,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:32:55,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 08:32:56,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:58,321 INFO [train.py:1039] (0/4) Epoch 19, batch 2650, loss[loss=0.1523, simple_loss=0.2273, pruned_loss=0.03865, over 24442.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2546, pruned_loss=0.05246, over 4703509.73 frames. ], batch size: 58, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:32:58,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-30 08:33:02,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 08:33:03,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:03,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=655126.6666666666, ans=0.125 2023-09-30 08:33:04,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:33:06,494 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 08:33:06,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:08,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:12,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:33:14,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:33:15,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:33:17,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 08:33:17,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 08:33:17,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:33:20,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 08:33:23,374 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 08:33:24,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:27,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 08:33:27,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:28,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 08:33:30,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:30,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:33:30,727 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:32,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:36,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. 
Duration: 0.44 2023-09-30 08:33:37,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 08:33:39,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:33:42,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 08:33:42,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:44,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:44,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:33:45,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:47,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:50,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:51,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:33:52,393 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.17 vs. limit=22.5 2023-09-30 08:33:53,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:33:53,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:33:54,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:33:57,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:58,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:34:00,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:00,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=655326.6666666666, ans=0.0 2023-09-30 08:34:01,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:02,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:34:06,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:06,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:34:06,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:34:08,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 08:34:12,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:34:13,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:15,959 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=655393.3333333334, ans=0.0 2023-09-30 08:34:16,333 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.20 vs. limit=6.0 2023-09-30 08:34:17,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:34:17,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:20,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:20,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. 
Duration: 0.84 2023-09-30 08:34:21,738 INFO [train.py:1039] (0/4) Epoch 19, batch 2700, loss[loss=0.1826, simple_loss=0.2438, pruned_loss=0.06065, over 23459.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2553, pruned_loss=0.05254, over 4709141.57 frames. ], batch size: 256, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:34:21,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:34:23,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:34:23,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=655460.0, ans=0.2 2023-09-30 08:34:25,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:34:25,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:25,574 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.08 vs. limit=10.0 2023-09-30 08:34:26,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:28,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:34:28,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:28,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 08:34:28,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:34:29,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 08:34:30,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:34:32,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:34:33,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:34:35,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:39,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:34:41,039 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.869e+02 2.051e+02 2.484e+02 4.492e+02, threshold=4.101e+02, percent-clipped=1.0 2023-09-30 08:34:41,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 08:34:41,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:34:46,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:34:46,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:34:53,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:34:53,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:53,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:34:53,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:34:57,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:59,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:00,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:35:00,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:05,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:05,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:35:14,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 08:35:16,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:35:21,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:35:21,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:26,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:26,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:28,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:28,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:29,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:29,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:35:34,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:35:34,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:35:34,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:37,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 08:35:37,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=655726.6666666666, ans=0.95 2023-09-30 08:35:40,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:40,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:35:40,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 08:35:41,460 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.05 vs. limit=22.5 2023-09-30 08:35:42,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 08:35:42,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:43,948 INFO [train.py:1039] (0/4) Epoch 19, batch 2750, loss[loss=0.1685, simple_loss=0.2299, pruned_loss=0.05354, over 23502.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2546, pruned_loss=0.05226, over 4712815.38 frames. ], batch size: 285, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:35:46,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 08:35:47,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:49,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:49,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:35:51,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:53,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:35:53,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=655793.3333333334, ans=0.125 2023-09-30 08:35:54,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:35:54,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:35:54,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:54,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 08:35:54,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 08:35:54,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:58,415 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=655793.3333333334, ans=0.0 2023-09-30 08:35:59,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 08:36:01,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:36:02,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:03,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:04,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:36:04,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:36:06,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:36:07,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:08,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:12,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 08:36:12,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:36:13,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:36:15,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:16,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:36:24,548 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.22 vs. limit=22.5 2023-09-30 08:36:25,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:27,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:36:27,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:32,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=655993.3333333334, ans=0.1 2023-09-30 08:36:35,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:35,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 08:36:35,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:36:40,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=655993.3333333334, ans=0.0 2023-09-30 08:36:43,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:36:44,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:44,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 08:36:47,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:49,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 08:36:53,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:36:57,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:36:59,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 08:36:59,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:37:00,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=656060.0, ans=0.125 2023-09-30 08:37:00,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.94 vs. limit=15.0 2023-09-30 08:37:01,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:37:02,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 08:37:03,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:37:06,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=656126.6666666666, ans=0.07 2023-09-30 08:37:07,202 INFO [train.py:1039] (0/4) Epoch 19, batch 2800, loss[loss=0.1928, simple_loss=0.2766, pruned_loss=0.05452, over 24066.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2546, pruned_loss=0.05159, over 4732897.27 frames. ], batch size: 80, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:37:07,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:37:07,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:08,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:08,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. 
Duration: 0.83 2023-09-30 08:37:08,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:10,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:11,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:11,954 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 08:37:11,955 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 08:37:15,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:16,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:37:16,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:37:20,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:37:21,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 08:37:23,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:37:23,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. 
Duration: 0.99 2023-09-30 08:37:25,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:26,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:37:26,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:28,349 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.835e+02 2.025e+02 2.355e+02 3.473e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 08:37:32,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:32,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:32,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:37:32,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:37:41,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:37:42,561 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:45,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:37:47,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:47,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:52,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:37:52,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 08:37:52,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:52,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:52,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:37:58,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:58,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:59,134 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.67 vs. limit=15.0 2023-09-30 08:38:02,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 08:38:05,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:38:05,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:05,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:38:06,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:38:06,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:38:08,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:38:08,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 08:38:08,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:10,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:38:10,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:11,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 08:38:11,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:38:11,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:38:13,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:38:14,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 08:38:19,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:38:19,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:38:21,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:38:22,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=656393.3333333334, ans=10.0 2023-09-30 08:38:24,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:27,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:38:27,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:38:28,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:38:30,212 INFO [train.py:1039] (0/4) Epoch 19, batch 2850, loss[loss=0.1503, simple_loss=0.231, pruned_loss=0.03482, over 24320.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2525, pruned_loss=0.05135, over 4712668.89 frames. ], batch size: 61, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:38:31,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:31,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:36,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:38:36,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 08:38:44,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 08:38:44,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:44,880 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.56 vs. limit=22.5 2023-09-30 08:38:45,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 08:38:45,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:48,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. 
Duration: 0.86 2023-09-30 08:38:50,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 08:38:51,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:04,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:05,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:05,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:39:07,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:39:07,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:39:07,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:39:07,776 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.81 vs. limit=22.5 2023-09-30 08:39:09,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:39:10,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 08:39:14,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 08:39:14,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:14,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:14,603 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=656593.3333333334, ans=0.2 2023-09-30 08:39:16,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:19,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:19,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:20,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:21,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=656660.0, ans=0.125 2023-09-30 08:39:22,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:22,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:39:24,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:39:24,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:25,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:39:31,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:39:33,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 08:39:33,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 08:39:35,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:39:35,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:35,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 08:39:36,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:39:38,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:38,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:39,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 08:39:39,482 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 08:39:39,548 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 08:39:39,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:39:39,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:46,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:39:46,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:48,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:49,159 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:39:50,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 08:39:51,776 INFO [train.py:1039] (0/4) Epoch 19, batch 2900, loss[loss=0.1883, simple_loss=0.275, pruned_loss=0.05079, over 24576.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2533, pruned_loss=0.0514, over 4720907.82 frames. ], batch size: 71, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:39:53,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:54,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. 
Duration: 0.63 2023-09-30 08:39:55,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 08:39:56,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:39:56,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:39:56,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=656793.3333333334, ans=0.125 2023-09-30 08:39:59,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:01,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:40:04,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:40:04,523 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=656793.3333333334, ans=0.025 2023-09-30 08:40:05,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:40:08,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:40:08,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 08:40:10,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. 
Duration: 0.62225 2023-09-30 08:40:11,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:13,062 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.848e+02 2.073e+02 2.444e+02 4.000e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:40:14,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 08:40:16,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 08:40:17,348 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.13 vs. limit=15.0 2023-09-30 08:40:19,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:40:19,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 08:40:19,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:40:24,144 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:40:24,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:40:27,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:29,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 08:40:32,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:40:35,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:40:37,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 08:40:37,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 08:40:37,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:40:41,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:40:44,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 08:40:46,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:40:51,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:41:00,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:41:01,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:41:01,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. 
Duration: 0.49 2023-09-30 08:41:05,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:05,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 08:41:05,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:05,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:41:05,483 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=657060.0, ans=0.2 2023-09-30 08:41:12,554 INFO [train.py:1039] (0/4) Epoch 19, batch 2950, loss[loss=0.1776, simple_loss=0.2508, pruned_loss=0.05218, over 23736.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2539, pruned_loss=0.0517, over 4722468.21 frames. ], batch size: 212, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:41:12,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:14,634 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=657126.6666666666, ans=0.2 2023-09-30 08:41:15,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 08:41:17,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:17,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:41:18,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:41:20,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:41:21,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 08:41:23,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 08:41:23,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:41:23,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:30,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:33,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:41:35,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:41:35,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:39,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:41:40,391 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.02 vs. limit=15.0 2023-09-30 08:41:41,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:41:42,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:41:48,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 08:41:53,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 08:41:53,526 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 08:41:54,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:41:56,526 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 08:41:58,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 08:41:58,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:41:58,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:58,265 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 08:41:58,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:42:00,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=657326.6666666666, ans=0.2 2023-09-30 08:42:01,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 08:42:03,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:42:03,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:42:05,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:07,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:42:07,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:07,194 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 08:42:07,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:42:07,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 08:42:15,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:16,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=657326.6666666666, ans=0.2 2023-09-30 08:42:17,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:18,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 08:42:18,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:42:20,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 08:42:22,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:23,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:42:25,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:42:26,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:26,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-30 08:42:28,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:42:29,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:29,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:42:29,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:42:31,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:31,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:42:33,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:34,279 INFO [train.py:1039] (0/4) Epoch 19, batch 3000, loss[loss=0.178, simple_loss=0.2733, pruned_loss=0.04132, over 24344.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2547, pruned_loss=0.0516, over 4729708.11 frames. ], batch size: 74, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:42:34,280 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 08:42:48,931 INFO [train.py:1071] (0/4) Epoch 19, validation: loss=0.3515, simple_loss=0.275, pruned_loss=0.214, over 1125622.00 frames. 2023-09-30 08:42:48,932 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 08:42:49,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. 
Duration: 0.9 2023-09-30 08:42:50,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:52,226 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:42:52,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:42:55,304 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 08:42:55,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 08:42:57,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:57,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:42:57,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=657460.0, ans=0.1 2023-09-30 08:42:58,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 08:42:58,990 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.21 vs. limit=15.0 2023-09-30 08:42:59,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:43:00,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=657460.0, ans=0.125 2023-09-30 08:43:04,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=657526.6666666666, ans=0.1 2023-09-30 08:43:05,033 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.36 vs. limit=10.0 2023-09-30 08:43:07,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:43:07,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=657526.6666666666, ans=0.2 2023-09-30 08:43:11,618 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.829e+02 2.117e+02 2.474e+02 3.888e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 08:43:16,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:43:25,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 08:43:25,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:43:27,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:43:27,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:43:28,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:43:30,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:30,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 08:43:33,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 08:43:35,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:43:35,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:43:37,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:43:37,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:38,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:38,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:43:38,762 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=657660.0, ans=0.125 2023-09-30 08:43:43,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 08:43:43,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:43,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:43:45,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:49,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 08:43:50,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:43:50,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:43:50,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:43:55,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:55,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:59,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:43:59,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 08:43:59,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 08:43:59,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 08:44:00,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:44:02,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 08:44:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:04,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:44:05,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 08:44:05,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 08:44:05,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:44:07,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:44:07,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:44:07,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:44:07,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 08:44:09,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:44:12,053 INFO [train.py:1039] (0/4) Epoch 19, batch 3050, loss[loss=0.1903, simple_loss=0.2589, pruned_loss=0.06084, over 23737.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2557, pruned_loss=0.05195, over 4720096.62 frames. ], batch size: 232, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:44:13,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 08:44:13,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=657793.3333333334, ans=0.125 2023-09-30 08:44:15,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:16,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:16,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:44:20,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:22,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 08:44:32,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 08:44:32,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. 
Duration: 0.94 2023-09-30 08:44:32,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:36,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:44:38,717 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=657860.0, ans=0.95 2023-09-30 08:44:41,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:42,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:43,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:43,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=657926.6666666666, ans=0.0 2023-09-30 08:44:46,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:44:46,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:46,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:48,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:44:48,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:48,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:50,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:53,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:53,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 08:44:53,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:53,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:44:55,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=657926.6666666666, ans=0.0 2023-09-30 08:44:58,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:58,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:44:59,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:00,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 08:45:06,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:45:07,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:16,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:16,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:45:16,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:45:19,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:19,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:45:19,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:45:21,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 08:45:22,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:22,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:24,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. 
Duration: 0.98 2023-09-30 08:45:27,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:32,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:34,308 INFO [train.py:1039] (0/4) Epoch 19, batch 3100, loss[loss=0.1519, simple_loss=0.2285, pruned_loss=0.03763, over 24372.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2551, pruned_loss=0.05183, over 4716220.71 frames. ], batch size: 56, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:45:34,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:45:36,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:45:38,306 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=658126.6666666666, ans=0.125 2023-09-30 08:45:39,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 08:45:42,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 08:45:42,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 08:45:44,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:45:47,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:45:47,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:52,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:45:55,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:56,633 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.813e+02 2.094e+02 2.454e+02 3.292e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 08:46:01,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 08:46:05,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:46:05,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:07,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:07,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:46:09,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:46:10,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:46:10,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. 
Duration: 0.62 2023-09-30 08:46:10,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:46:12,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:12,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 08:46:14,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:46:20,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:46:20,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 08:46:20,982 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=658260.0, ans=0.04949747468305833 2023-09-30 08:46:22,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 08:46:23,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:25,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:26,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:26,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:46:26,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:46:28,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:46:28,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:46:30,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:46:30,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:46:30,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:30,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 08:46:34,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:35,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=658326.6666666666, ans=0.0 2023-09-30 08:46:36,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 08:46:40,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:46:42,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. 
Duration: 0.8 2023-09-30 08:46:42,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:42,633 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:46:43,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:43,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 08:46:44,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=658393.3333333334, ans=0.125 2023-09-30 08:46:45,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=658393.3333333334, ans=0.2 2023-09-30 08:46:54,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 08:46:56,193 INFO [train.py:1039] (0/4) Epoch 19, batch 3150, loss[loss=0.165, simple_loss=0.2443, pruned_loss=0.04287, over 21962.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2535, pruned_loss=0.05162, over 4699356.64 frames. ], batch size: 48, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:46:58,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:46:59,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:47:00,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=658460.0, ans=0.125 2023-09-30 08:47:00,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=658460.0, ans=0.0 2023-09-30 08:47:01,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:47:01,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:47:01,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 08:47:03,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:03,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:47:04,332 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.51 vs. limit=12.0 2023-09-30 08:47:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 08:47:06,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:10,819 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 08:47:11,450 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.74 vs. 
limit=15.0 2023-09-30 08:47:13,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 08:47:13,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:47:15,487 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 08:47:15,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:47:18,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 08:47:18,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 08:47:18,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 08:47:19,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:19,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:20,545 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:22,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 08:47:23,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:47:25,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:26,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:47:30,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 08:47:32,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:47:32,551 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=658593.3333333334, ans=0.05 2023-09-30 08:47:33,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:47:33,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:35,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 08:47:36,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 08:47:38,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:47:38,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 08:47:39,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:47:39,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:39,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:47:44,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:47:44,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:47:45,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 08:47:47,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:47:47,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:49,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:47:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:49,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 08:47:49,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 08:47:49,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=658660.0, ans=0.1 2023-09-30 08:47:51,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 08:47:52,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:52,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 08:47:54,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 08:47:55,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:47:55,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:56,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=658660.0, ans=0.2 2023-09-30 08:47:57,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 08:48:00,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:48:00,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:48:00,613 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=658660.0, ans=0.1 2023-09-30 08:48:03,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:48:05,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:06,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:48:10,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=658726.6666666666, ans=0.5 2023-09-30 08:48:11,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:48:11,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:14,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:48:20,487 INFO [train.py:1039] (0/4) Epoch 19, batch 3200, loss[loss=0.141, simple_loss=0.2174, pruned_loss=0.03234, over 24333.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2528, pruned_loss=0.05152, over 4707541.37 frames. ], batch size: 56, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:48:20,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:48:20,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. 
Duration: 0.536375 2023-09-30 08:48:24,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:24,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:48:24,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 08:48:27,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:48:32,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:48:32,631 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=2.145e-02 2023-09-30 08:48:35,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:43,537 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.911e+02 2.162e+02 2.460e+02 4.180e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-30 08:48:43,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:48:56,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 08:48:57,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:48:59,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=658926.6666666666, ans=0.125 2023-09-30 08:49:00,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 08:49:02,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:49:05,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:49:05,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:49:06,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:49:09,382 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.08 vs. limit=10.0 2023-09-30 08:49:11,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 08:49:13,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:49:13,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=658993.3333333334, ans=0.2 2023-09-30 08:49:14,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 08:49:16,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. 
Duration: 0.91 2023-09-30 08:49:20,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:49:26,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:26,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:49:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:28,538 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 08:49:28,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:49:33,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:49:35,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 08:49:36,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 08:49:36,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 08:49:38,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 08:49:41,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 08:49:42,869 INFO [train.py:1039] (0/4) Epoch 19, batch 3250, loss[loss=0.1982, simple_loss=0.2825, pruned_loss=0.05696, over 24034.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2528, pruned_loss=0.05113, over 4715193.58 frames. ], batch size: 80, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:49:43,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:49:44,488 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 08:49:44,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:49:44,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:49:46,070 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 08:49:50,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:49:54,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:49:59,466 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=659193.3333333334, ans=0.125 2023-09-30 08:50:04,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:04,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. 
Duration: 0.99 2023-09-30 08:50:05,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:05,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:50:05,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:05,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=659193.3333333334, ans=0.125 2023-09-30 08:50:07,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:07,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:50:10,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:11,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:50:11,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:12,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:50:15,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:17,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:18,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:18,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:20,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:20,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:20,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:26,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 08:50:26,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:50:26,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:50:28,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:30,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 08:50:37,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:50:45,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:50:47,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:47,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 08:50:47,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:50:47,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:50:47,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:48,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 08:50:50,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 08:50:50,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:51,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:53,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:50:53,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:50:53,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:58,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:58,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:01,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 08:51:01,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:01,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=659393.3333333334, ans=0.0 2023-09-30 08:51:04,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:51:04,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 08:51:05,721 INFO [train.py:1039] (0/4) Epoch 19, batch 3300, loss[loss=0.1512, simple_loss=0.2251, pruned_loss=0.03871, over 24325.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2535, pruned_loss=0.0516, over 4704470.62 frames. ], batch size: 56, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:51:07,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 08:51:07,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 08:51:08,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 08:51:10,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 08:51:10,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:10,783 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=659460.0, ans=0.125 2023-09-30 08:51:16,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:18,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:51:19,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:19,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:51:22,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:51:25,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:25,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 08:51:27,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=659526.6666666666, ans=0.2 2023-09-30 08:51:28,250 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.763e+02 1.973e+02 2.210e+02 4.562e+02, threshold=3.946e+02, percent-clipped=1.0 2023-09-30 08:51:29,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 08:51:29,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:51:30,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:32,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:32,906 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 08:51:34,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=659526.6666666666, ans=0.2 2023-09-30 08:51:35,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:51:37,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:51:37,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:51:37,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 08:51:38,687 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 08:51:43,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:43,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:51:44,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:44,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 08:51:46,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 08:51:46,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:47,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:51:50,099 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 08:51:51,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 08:51:51,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:51:54,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. 
Duration: 0.39 2023-09-30 08:51:54,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:51:59,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:52:01,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:02,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:02,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:02,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:52:02,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:52:06,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:52:06,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:06,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:52:07,788 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 08:52:09,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. 
Duration: 0.55 2023-09-30 08:52:11,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:52:11,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:11,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:13,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:13,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:15,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:52:16,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:16,777 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:52:16,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:19,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:52:23,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 08:52:23,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:52:25,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:25,493 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=22.5 2023-09-30 08:52:26,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:52:28,519 INFO [train.py:1039] (0/4) Epoch 19, batch 3350, loss[loss=0.1978, simple_loss=0.2665, pruned_loss=0.06461, over 23843.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2551, pruned_loss=0.05199, over 4707300.74 frames. ], batch size: 195, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:52:28,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:52:28,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:30,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:30,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:33,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:33,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 08:52:36,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:38,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:52:39,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:41,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:52:43,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 08:52:44,935 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 08:52:45,174 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=659860.0, ans=0.0 2023-09-30 08:52:45,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=659860.0, ans=0.1 2023-09-30 08:52:47,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:48,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 08:52:48,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 08:52:50,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 08:52:50,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:52:51,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:51,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 08:52:54,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:54,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:52:54,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=659860.0, ans=0.125 2023-09-30 08:52:57,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:58,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:58,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:00,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:53:03,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:53:03,506 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:53:06,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:07,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:11,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:53:12,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:14,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:16,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:18,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:20,279 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:53:21,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 08:53:21,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:53:21,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. 
Duration: 0.87 2023-09-30 08:53:21,633 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:53:21,972 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=659993.3333333334, ans=0.2 2023-09-30 08:53:23,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 08:53:25,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:28,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:34,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:34,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 08:53:35,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:53:37,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:53:38,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=660060.0, ans=0.125 2023-09-30 08:53:39,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:53:44,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:53:46,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 08:53:46,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:53:47,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:53:49,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:49,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 08:53:49,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:49,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 08:53:51,458 INFO [train.py:1039] (0/4) Epoch 19, batch 3400, loss[loss=0.1924, simple_loss=0.256, pruned_loss=0.06438, over 23730.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2554, pruned_loss=0.05221, over 4703083.10 frames. ], batch size: 232, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:53:51,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:51,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:53,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 08:53:54,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:53:54,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 08:54:01,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 08:54:01,331 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 08:54:01,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:05,743 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.41 vs. limit=12.0 2023-09-30 08:54:06,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:54:06,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:54:06,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:07,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:54:14,448 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.902e+02 2.101e+02 2.445e+02 3.700e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-30 08:54:14,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 08:54:16,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 08:54:20,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:54:23,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:23,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:23,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:54:31,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:54:36,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 08:54:42,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:42,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:43,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 08:54:43,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:54:45,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 08:54:46,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:48,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:54:50,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:50,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=660326.6666666666, ans=0.125 2023-09-30 08:54:53,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:54:53,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:55:00,538 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:03,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 08:55:08,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:55:08,700 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.66 vs. limit=22.5 2023-09-30 08:55:11,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. 
Duration: 0.68 2023-09-30 08:55:13,170 INFO [train.py:1039] (0/4) Epoch 19, batch 3450, loss[loss=0.1627, simple_loss=0.2379, pruned_loss=0.04374, over 21065.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2545, pruned_loss=0.05205, over 4703878.15 frames. ], batch size: 46, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:55:16,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 08:55:18,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:55:19,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:55:19,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 08:55:21,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:23,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=660460.0, ans=0.125 2023-09-30 08:55:25,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:55:29,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:55:31,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:31,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 08:55:31,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:32,351 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.39 vs. limit=22.5 2023-09-30 08:55:34,142 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=14.40 vs. limit=15.0 2023-09-30 08:55:35,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:41,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 08:55:48,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 08:55:48,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:55:48,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:55:51,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:56,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 08:55:56,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 08:55:57,185 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:56:01,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:01,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:56:02,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:56:04,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:56:06,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 08:56:06,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:08,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:56:11,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:14,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 08:56:18,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:56:22,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 08:56:24,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:27,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:32,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:32,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:34,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:56:34,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:35,712 INFO [train.py:1039] (0/4) Epoch 19, batch 3500, loss[loss=0.1589, simple_loss=0.2079, pruned_loss=0.05498, over 19416.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2532, pruned_loss=0.0511, over 4722168.18 frames. ], batch size: 388, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:56:38,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:42,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:56:42,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 08:56:45,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-30 08:56:48,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 08:56:51,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:52,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=660860.0, ans=0.0 2023-09-30 08:56:53,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 08:56:57,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:56:58,824 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.797e+02 1.956e+02 2.209e+02 3.007e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 08:56:59,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:59,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:57:01,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:01,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:57:01,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:01,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 08:57:01,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 08:57:04,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:05,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:57:07,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:12,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:12,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 08:57:13,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:15,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:18,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:57:20,245 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:21,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:57:21,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 08:57:23,358 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 08:57:23,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 08:57:25,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 08:57:25,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:26,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:28,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:28,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:57:31,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:57:33,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:57:37,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:57:39,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 08:57:40,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. 
Duration: 0.88 2023-09-30 08:57:40,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:57:41,889 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.whiten.whitening_limit, batch_count=661060.0, ans=12.0 2023-09-30 08:57:42,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:42,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:45,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:49,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 08:57:50,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:52,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:53,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 08:57:56,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 08:57:56,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:58,224 INFO [train.py:1039] (0/4) Epoch 19, batch 3550, loss[loss=0.1723, simple_loss=0.2416, pruned_loss=0.05149, over 23675.00 frames. 
], tot_loss[loss=0.1761, simple_loss=0.2515, pruned_loss=0.05041, over 4730095.64 frames. ], batch size: 232, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:57:58,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:58,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:57:58,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=661126.6666666666, ans=0.0 2023-09-30 08:58:00,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:03,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:58:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:15,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:58:20,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:20,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:58:22,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:22,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 08:58:23,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:58:25,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:27,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:58:27,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:27,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:58:28,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:58:34,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:58:34,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:36,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:36,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:38,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:58:38,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. 
Duration: 0.38 2023-09-30 08:58:38,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:40,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:41,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:58:43,880 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=661260.0, ans=0.2 2023-09-30 08:58:49,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:49,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:49,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:50,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 08:58:52,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:58:52,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 08:58:52,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:55,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 08:58:55,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:59:00,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 08:59:00,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 08:59:07,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:11,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:59:12,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 08:59:17,011 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-09-30 08:59:21,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 08:59:21,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:59:21,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 08:59:22,705 INFO [train.py:1039] (0/4) Epoch 19, batch 3600, loss[loss=0.1829, simple_loss=0.2565, pruned_loss=0.05471, over 24323.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.251, pruned_loss=0.05014, over 4723927.95 frames. ], batch size: 61, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:59:22,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:59:28,482 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.66 vs. limit=12.0 2023-09-30 08:59:29,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:31,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:32,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:59:32,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:59:34,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:34,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. 
Duration: 0.89 2023-09-30 08:59:35,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:59:37,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:37,489 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=661526.6666666666, ans=0.025 2023-09-30 08:59:41,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:43,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:45,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:59:45,578 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:45,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 08:59:46,963 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.820e+02 2.003e+02 2.240e+02 3.370e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-30 08:59:47,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:50,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 08:59:50,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:59:53,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:55,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:56,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:59:57,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 09:00:05,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:06,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:00:06,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 09:00:08,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=661593.3333333334, ans=0.05 2023-09-30 09:00:12,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:00:15,065 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.48 vs. 
limit=15.0 2023-09-30 09:00:19,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:21,763 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=661660.0, ans=0.125 2023-09-30 09:00:22,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:28,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:00:29,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:00:29,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 09:00:29,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=661726.6666666666, ans=0.125 2023-09-30 09:00:31,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 09:00:32,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 09:00:34,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:00:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:00:35,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. 
Duration: 0.84 2023-09-30 09:00:37,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:00:37,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:00:37,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:39,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 09:00:39,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 09:00:44,171 INFO [train.py:1039] (0/4) Epoch 19, batch 3650, loss[loss=0.1888, simple_loss=0.2713, pruned_loss=0.05313, over 24634.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.252, pruned_loss=0.05006, over 4726517.17 frames. ], batch size: 68, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 09:00:44,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:45,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=661793.3333333334, ans=22.5 2023-09-30 09:00:45,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 09:00:49,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 09:00:52,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 09:00:57,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 09:00:59,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 09:01:02,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:02,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:01:03,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:01:06,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:01:06,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:01:07,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 09:01:07,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:01:07,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:09,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-09-30 09:01:10,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=661860.0, ans=0.125 2023-09-30 09:01:11,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:01:12,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:12,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:14,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:01:17,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 09:01:19,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 09:01:20,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:01:22,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 09:01:23,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:25,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:01:28,269 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.61 vs. 
limit=15.0 2023-09-30 09:01:29,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:01:32,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:32,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:01:34,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:01:34,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:01:35,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:01:40,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:42,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:42,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:42,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-09-30 09:01:44,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 09:01:46,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:46,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:50,999 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 09:01:55,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:55,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:56,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:01:58,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:00,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:02:00,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:03,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 09:02:03,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:06,513 INFO [train.py:1039] (0/4) Epoch 19, batch 3700, loss[loss=0.188, simple_loss=0.2615, pruned_loss=0.0572, over 23616.00 frames. 
], tot_loss[loss=0.1772, simple_loss=0.253, pruned_loss=0.05071, over 4722517.06 frames. ], batch size: 120, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:02:06,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:02:10,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:02:10,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:02:13,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:13,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 09:02:13,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:14,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:02:14,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:02:16,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:02:19,297 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=662126.6666666666, ans=0.0 2023-09-30 09:02:22,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:02:22,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:23,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:02:23,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:25,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:02:28,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:28,296 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 09:02:31,273 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.890e+02 2.038e+02 2.335e+02 3.154e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-30 09:02:38,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:02:38,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:02:39,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:02:39,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 09:02:39,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 09:02:40,640 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.41 vs. limit=12.0 2023-09-30 09:02:43,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:45,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 09:02:46,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:48,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:02:51,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:51,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:02:55,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:02:58,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:58,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 09:03:00,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:00,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. 
Duration: 0.98 2023-09-30 09:03:05,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:03:06,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:03:09,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:09,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 09:03:11,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:03:11,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:03:11,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:13,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:18,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:18,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=662393.3333333334, ans=0.125 2023-09-30 09:03:19,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 09:03:21,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. 
Duration: 0.83 2023-09-30 09:03:22,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:03:22,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:24,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:03:25,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:03:28,138 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.48 vs. limit=12.0 2023-09-30 09:03:29,202 INFO [train.py:1039] (0/4) Epoch 19, batch 3750, loss[loss=0.1699, simple_loss=0.2513, pruned_loss=0.04425, over 24453.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2543, pruned_loss=0.05122, over 4709956.82 frames. ], batch size: 66, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:03:29,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:03:31,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:03:32,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:03:32,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 09:03:34,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. 
Duration: 0.4181875 2023-09-30 09:03:34,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=662460.0, ans=0.1 2023-09-30 09:03:36,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:03:37,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 09:03:39,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:03:40,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:40,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:42,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:03:47,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:50,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:03:50,806 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=662526.6666666666, ans=0.0 2023-09-30 09:03:52,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 09:03:55,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:58,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:03:58,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 09:04:00,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:01,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:01,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:04:04,181 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=662593.3333333334, ans=0.1 2023-09-30 09:04:05,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 09:04:09,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 09:04:10,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:11,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 09:04:11,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=662593.3333333334, ans=0.1 2023-09-30 09:04:13,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:04:13,980 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.93 vs. limit=6.0 2023-09-30 09:04:15,920 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-09-30 09:04:16,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:19,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:04:20,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 09:04:25,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:27,889 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=12.0 2023-09-30 09:04:28,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:04:28,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 09:04:33,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:04:37,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:04:39,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:04:42,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:04:42,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:04:45,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:04:50,707 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.35 vs. limit=22.5 2023-09-30 09:04:51,927 INFO [train.py:1039] (0/4) Epoch 19, batch 3800, loss[loss=0.1925, simple_loss=0.2654, pruned_loss=0.05984, over 23484.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2542, pruned_loss=0.05138, over 4711818.44 frames. ], batch size: 119, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:04:55,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:04:56,998 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=662793.3333333334, ans=0.0 2023-09-30 09:05:01,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 09:05:01,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:05:03,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 09:05:04,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:04,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:06,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:05:08,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:05:08,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:10,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:05:11,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:13,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:05:13,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:15,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-09-30 09:05:18,094 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.822e+02 1.936e+02 2.155e+02 2.834e+02, threshold=3.873e+02, percent-clipped=0.0 2023-09-30 09:05:19,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:05:21,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:05:23,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:24,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:05:26,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:05:28,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:05:28,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:30,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:31,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:36,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:05:36,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. 
Duration: 0.7 2023-09-30 09:05:39,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:05:46,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:05:50,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=662993.3333333334, ans=0.1 2023-09-30 09:05:51,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=662993.3333333334, ans=0.0 2023-09-30 09:05:53,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:05:56,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 09:05:57,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 09:05:59,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:00,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:06:00,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:04,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. 
Duration: 0.47 2023-09-30 09:06:06,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=663060.0, ans=0.125 2023-09-30 09:06:07,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 09:06:07,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 09:06:07,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:07,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=663060.0, ans=0.125 2023-09-30 09:06:09,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:06:13,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:06:14,484 INFO [train.py:1039] (0/4) Epoch 19, batch 3850, loss[loss=0.2104, simple_loss=0.277, pruned_loss=0.07183, over 23223.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2536, pruned_loss=0.05114, over 4712162.66 frames. ], batch size: 105, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:06:14,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:06:19,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:06:21,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. 
Duration: 0.81 2023-09-30 09:06:22,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:06:23,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:26,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:06:28,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:31,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:06:32,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 09:06:36,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=663193.3333333334, ans=0.015 2023-09-30 09:06:37,061 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.70 vs. limit=6.0 2023-09-30 09:06:39,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:41,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:43,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:06:43,555 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=663193.3333333334, ans=0.2 2023-09-30 09:06:44,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:06:46,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:47,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:06:50,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:50,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:06:50,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:53,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:53,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:54,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:06:54,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 09:06:54,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. 
Duration: 0.91 2023-09-30 09:06:56,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:56,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:59,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:00,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=663260.0, ans=0.0 2023-09-30 09:07:01,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:02,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 09:07:05,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 09:07:07,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 09:07:12,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:07:14,619 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.57 vs. limit=22.5 2023-09-30 09:07:15,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:07:17,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:22,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:22,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 09:07:26,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 09:07:27,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:29,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:31,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:07:31,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:07:31,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:07:33,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. 
Duration: 0.97 2023-09-30 09:07:34,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:07:36,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 09:07:36,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:36,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:37,755 INFO [train.py:1039] (0/4) Epoch 19, batch 3900, loss[loss=0.161, simple_loss=0.2279, pruned_loss=0.04708, over 23405.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2523, pruned_loss=0.05095, over 4705136.72 frames. ], batch size: 285, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:07:37,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:07:39,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:40,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:07:42,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:42,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:43,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:07:43,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 09:07:43,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:47,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:49,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:51,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:07:52,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:55,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:55,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:57,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:07:59,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 09:07:59,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:02,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. 
Duration: 0.8 2023-09-30 09:08:02,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:08:02,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 09:08:03,990 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.880e+02 2.094e+02 2.304e+02 3.533e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 09:08:04,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 09:08:05,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=663526.6666666666, ans=0.0 2023-09-30 09:08:10,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:10,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:08:12,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:08:12,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:15,987 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=663593.3333333334, ans=0.2 2023-09-30 09:08:17,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:18,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 09:08:22,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:08:22,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:23,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:08:29,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:29,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:08:36,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:08:37,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:08:49,257 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:08:49,986 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.15 vs. limit=15.0 2023-09-30 09:08:53,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:53,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. 
Duration: 0.72 2023-09-30 09:08:53,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 09:08:53,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:54,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=663726.6666666666, ans=0.0 2023-09-30 09:08:55,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 09:08:57,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:57,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=663726.6666666666, ans=0.2 2023-09-30 09:08:58,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 09:09:00,551 INFO [train.py:1039] (0/4) Epoch 19, batch 3950, loss[loss=0.1751, simple_loss=0.2557, pruned_loss=0.04722, over 24073.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2517, pruned_loss=0.05055, over 4710786.11 frames. ], batch size: 80, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:09:04,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:09:05,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 09:09:06,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 09:09:08,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:09:09,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:09:16,011 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 09:09:17,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:18,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 09:09:19,471 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 09:09:19,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:09:21,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:22,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:09:22,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:24,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 09:09:27,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:09:27,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:27,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:09:29,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:09:29,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:09:33,619 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.28 vs. limit=22.5 2023-09-30 09:09:36,971 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=663926.6666666666, ans=0.125 2023-09-30 09:09:43,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:09:43,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:09:50,070 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=663993.3333333334, ans=0.2 2023-09-30 09:09:51,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. 
Duration: 0.94 2023-09-30 09:09:52,003 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=663993.3333333334, ans=0.0 2023-09-30 09:09:58,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 09:09:58,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 09:09:58,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:09:58,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=663993.3333333334, ans=0.125 2023-09-30 09:10:00,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:10:06,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:10:06,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:10:08,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:08,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:10:08,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. 
Duration: 0.62 2023-09-30 09:10:08,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=664060.0, ans=0.2 2023-09-30 09:10:14,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:10:16,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:10:18,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=664060.0, ans=0.125 2023-09-30 09:10:19,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 09:10:24,393 INFO [train.py:1039] (0/4) Epoch 19, batch 4000, loss[loss=0.1992, simple_loss=0.2653, pruned_loss=0.06653, over 23685.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2525, pruned_loss=0.05085, over 4705857.39 frames. ], batch size: 179, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:10:26,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=664126.6666666666, ans=0.0 2023-09-30 09:10:31,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:34,597 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=664126.6666666666, ans=0.125 2023-09-30 09:10:38,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:42,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 09:10:42,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:10:44,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:44,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 09:10:45,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:10:45,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 09:10:45,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:10:45,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 09:10:48,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:51,345 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.842e+02 2.148e+02 2.341e+02 3.331e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-30 09:10:53,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:10:53,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:10:53,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 09:10:53,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:53,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:10:56,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:10:57,701 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 09:10:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:10:59,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:02,434 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 09:11:03,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:11:03,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:09,323 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 09:11:09,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:11:12,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 09:11:14,368 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 09:11:15,863 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:11:17,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 09:11:17,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:11:18,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:19,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:11:20,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:11:20,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:11:20,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:24,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 09:11:24,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:11:24,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=664326.6666666666, ans=0.2 2023-09-30 09:11:25,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=664326.6666666666, ans=0.1 2023-09-30 09:11:27,589 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 09:11:30,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:11:34,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:11:37,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:11:37,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:38,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:11:40,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:11:45,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:45,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=664460.0, ans=0.125 2023-09-30 09:11:46,665 INFO [train.py:1039] (0/4) Epoch 19, batch 4050, loss[loss=0.1816, simple_loss=0.2577, pruned_loss=0.05281, over 23420.00 frames. 
], tot_loss[loss=0.1774, simple_loss=0.253, pruned_loss=0.05096, over 4711420.54 frames. ], batch size: 93, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:11:47,185 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=664460.0, ans=0.0 2023-09-30 09:11:48,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:11:48,524 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=664460.0, ans=0.07 2023-09-30 09:11:50,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 09:11:50,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=664460.0, ans=0.125 2023-09-30 09:11:51,158 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.47 vs. limit=6.0 2023-09-30 09:11:51,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:11:51,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:11:53,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:11:55,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:11:56,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 09:12:02,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:03,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:05,175 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:12:06,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:12:06,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:12:12,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:13,018 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=664526.6666666666, ans=0.2 2023-09-30 09:12:14,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:12:17,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 09:12:19,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 09:12:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 09:12:22,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 09:12:27,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 09:12:29,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:12:32,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:38,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:38,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:12:38,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:41,572 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=664660.0, ans=0.125 2023-09-30 09:12:42,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:45,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 09:12:47,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:12:47,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:12:49,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. 
Duration: 0.83 2023-09-30 09:12:51,368 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.78 vs. limit=15.0 2023-09-30 09:12:52,410 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=664726.6666666666, ans=0.05 2023-09-30 09:12:55,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:13:03,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 09:13:05,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:05,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:13:07,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 09:13:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 09:13:07,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:09,194 INFO [train.py:1039] (0/4) Epoch 19, batch 4100, loss[loss=0.1762, simple_loss=0.2515, pruned_loss=0.05041, over 23330.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.254, pruned_loss=0.05148, over 4722932.25 frames. ], batch size: 119, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:13:09,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 09:13:09,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=664793.3333333334, ans=0.125 2023-09-30 09:13:11,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:11,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:13:16,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 09:13:16,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 09:13:16,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=664793.3333333334, ans=0.125 2023-09-30 09:13:19,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 09:13:20,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 09:13:20,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:21,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 09:13:22,532 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 09:13:27,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:29,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:13:29,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:29,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:13:33,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:13:34,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:34,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:13:34,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 09:13:36,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:36,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:13:36,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 09:13:36,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:13:38,155 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.904e+02 2.164e+02 2.650e+02 3.755e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-30 09:13:38,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 09:13:41,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:13:43,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 09:13:45,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:48,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:48,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 09:13:48,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:49,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:13:50,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:13:51,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. 
Duration: 0.96 2023-09-30 09:13:53,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:13:54,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:13:57,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 09:13:57,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:57,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:00,874 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.43 vs. limit=15.0 2023-09-30 09:14:01,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:07,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:10,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:11,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:14:19,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:14:19,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:22,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:23,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:14:28,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:30,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:14:31,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:14:31,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:33,078 INFO [train.py:1039] (0/4) Epoch 19, batch 4150, loss[loss=0.1899, simple_loss=0.273, pruned_loss=0.05338, over 24340.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2542, pruned_loss=0.05205, over 4707640.05 frames. ], batch size: 77, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:14:34,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 09:14:34,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:36,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. 
Duration: 0.49 2023-09-30 09:14:37,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 09:14:37,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 09:14:39,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:39,661 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=665126.6666666666, ans=0.125 2023-09-30 09:14:45,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:14:45,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:50,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:14:52,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:14:52,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:14:55,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:14:55,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 09:14:55,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=665193.3333333334, ans=0.2 2023-09-30 09:14:56,911 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:15:02,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:08,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:08,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 09:15:08,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=665260.0, ans=0.2 2023-09-30 09:15:11,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 09:15:11,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:15:12,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 09:15:12,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:15:12,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:15,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:15:15,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:22,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 09:15:25,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:28,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:15:28,280 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 09:15:29,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:31,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 09:15:31,482 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=665326.6666666666, ans=0.1 2023-09-30 09:15:34,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:15:34,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:35,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:37,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. 
Duration: 0.74 2023-09-30 09:15:37,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:15:38,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:15:39,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:15:41,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 09:15:42,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:42,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:15:42,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:15:44,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 09:15:44,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:45,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:15:45,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:48,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:15:50,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 09:15:50,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:50,992 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.56 vs. limit=15.0 2023-09-30 09:15:52,445 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-09-30 09:15:55,501 INFO [train.py:1039] (0/4) Epoch 19, batch 4200, loss[loss=0.1956, simple_loss=0.2723, pruned_loss=0.0595, over 23316.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2526, pruned_loss=0.05162, over 4703353.87 frames. ], batch size: 93, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:15:55,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:15:55,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 09:15:58,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:16:00,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:02,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:16:02,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 09:16:02,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:05,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 09:16:08,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 09:16:08,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:11,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:15,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:16:18,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:16:19,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:19,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:21,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 09:16:21,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 09:16:22,942 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.959e+02 2.201e+02 2.604e+02 4.093e+02, threshold=4.401e+02, percent-clipped=0.0 2023-09-30 09:16:23,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:23,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:23,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:16:25,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:16:27,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 09:16:27,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:32,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:16:33,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:16:34,105 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=665593.3333333334, ans=0.125 2023-09-30 09:16:37,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 09:16:38,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:16:40,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:16:40,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 09:16:40,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:16:42,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:16:48,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:16:49,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:52,090 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=665660.0, ans=0.125 2023-09-30 09:16:54,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:16:58,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 09:16:58,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=665660.0, ans=0.1 2023-09-30 09:17:01,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:17:06,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:17:08,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:10,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 09:17:16,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:17:17,657 INFO [train.py:1039] (0/4) Epoch 19, batch 4250, loss[loss=0.1781, simple_loss=0.2424, pruned_loss=0.05696, over 23539.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2519, pruned_loss=0.05085, over 4713953.02 frames. ], batch size: 256, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:17:19,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:17:19,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:17:22,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:27,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:17:29,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 09:17:29,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:17:32,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:35,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=665860.0, ans=0.2 2023-09-30 09:17:36,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:40,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:40,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:43,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:17:43,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:17:46,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:48,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:49,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:51,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:17:52,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:17:54,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 09:17:54,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=665926.6666666666, ans=0.125 2023-09-30 09:17:58,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 09:17:58,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:58,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:59,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:59,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=665926.6666666666, ans=0.1 2023-09-30 09:18:00,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:18:00,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:02,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 09:18:02,960 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=665926.6666666666, ans=0.5 2023-09-30 09:18:05,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:18:07,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:18:11,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:13,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:13,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 09:18:14,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:18:14,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 09:18:16,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:18:18,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=665993.3333333334, ans=0.1 2023-09-30 09:18:19,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:18:21,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:18:21,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:18:22,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 09:18:24,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:18:24,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:18:27,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:30,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:33,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:18:35,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:36,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:37,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:18:37,755 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=666060.0, ans=0.125 2023-09-30 09:18:38,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:18:38,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 09:18:41,240 INFO [train.py:1039] (0/4) Epoch 19, batch 4300, loss[loss=0.1871, simple_loss=0.2617, pruned_loss=0.0563, over 23568.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2515, pruned_loss=0.05063, over 4708443.47 frames. ], batch size: 106, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:18:41,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:44,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:46,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:18:51,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:55,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=666126.6666666666, ans=0.125 2023-09-30 09:18:58,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:58,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 09:18:59,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:19:01,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 09:19:01,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:19:01,376 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 09:19:04,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:19:06,841 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-30 09:19:07,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:08,727 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.820e+02 2.101e+02 2.491e+02 4.654e+02, threshold=4.202e+02, percent-clipped=1.0 2023-09-30 09:19:09,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 09:19:09,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:19:09,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 09:19:12,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:19:14,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:19:17,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 09:19:19,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:19:19,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:19:20,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:22,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:19:24,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 09:19:24,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 09:19:27,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:19:30,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:19:30,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:31,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. 
Duration: 0.75 2023-09-30 09:19:31,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 09:19:31,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 09:19:31,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:19:32,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 09:19:32,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 09:19:36,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:38,183 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 09:19:40,967 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:19:42,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:42,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:46,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 09:19:46,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 09:19:46,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:47,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:19:47,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:19:49,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:19:52,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:19:55,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:57,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:57,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:20:01,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=666460.0, ans=0.1 2023-09-30 09:20:02,593 INFO [train.py:1039] (0/4) Epoch 19, batch 4350, loss[loss=0.1889, simple_loss=0.2728, pruned_loss=0.05253, over 24312.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2524, pruned_loss=0.0508, over 4718385.21 frames. ], batch size: 74, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:20:02,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. 
Duration: 0.97 2023-09-30 09:20:02,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:20:05,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=666460.0, ans=0.025 2023-09-30 09:20:09,595 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.45 vs. limit=5.0 2023-09-30 09:20:10,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:13,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:16,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:20:16,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:20:20,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:20:24,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:27,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:20:27,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:20:30,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 09:20:33,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:20:35,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:20:39,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 09:20:41,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:42,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:47,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:49,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=666660.0, ans=0.0 2023-09-30 09:20:50,618 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-100000.pt 2023-09-30 09:20:53,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 09:20:55,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:20:55,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=666660.0, ans=0.0 2023-09-30 09:20:57,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 09:21:02,693 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 09:21:04,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:04,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:21:06,165 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 09:21:06,294 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 09:21:07,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:07,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:09,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:21:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:09,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:09,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:21:13,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. 
Duration: 0.94 2023-09-30 09:21:13,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:13,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:13,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:14,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 09:21:16,303 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 09:21:16,310 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 09:21:16,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 09:21:20,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:21:20,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:21:22,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:23,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:21:23,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. 
Duration: 0.82 2023-09-30 09:21:25,469 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 09:21:25,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:26,825 INFO [train.py:1039] (0/4) Epoch 19, batch 4400, loss[loss=0.1806, simple_loss=0.2524, pruned_loss=0.05441, over 23414.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2539, pruned_loss=0.05158, over 4708934.28 frames. ], batch size: 93, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:21:28,937 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:21:29,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:29,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:31,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:35,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 09:21:35,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 09:21:37,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 09:21:37,090 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 09:21:37,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-30 09:21:37,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:40,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 09:21:43,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:43,894 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.99 vs. limit=15.0 2023-09-30 09:21:45,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:45,226 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 09:21:49,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:49,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 09:21:49,751 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 09:21:54,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 09:21:54,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 09:21:54,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. 
Duration: 0.96 2023-09-30 09:21:54,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:55,241 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.86 vs. limit=10.0 2023-09-30 09:21:55,660 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.851e+02 2.031e+02 2.280e+02 3.356e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 09:21:55,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:00,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 09:22:00,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 09:22:00,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:00,941 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=666926.6666666666, ans=0.125 2023-09-30 09:22:02,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 09:22:02,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:02,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=666926.6666666666, ans=0.125 2023-09-30 09:22:03,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:03,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=666926.6666666666, ans=0.125 2023-09-30 09:22:05,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:05,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 09:22:06,764 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 09:22:09,117 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=666926.6666666666, ans=0.125 2023-09-30 09:22:10,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:12,067 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=666926.6666666666, ans=0.125 2023-09-30 09:22:16,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:17,903 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. 
Duration: 0.78 2023-09-30 09:22:22,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:22:24,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:29,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:22:29,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 09:22:29,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:22:29,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:22:29,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:22:30,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:22:31,219 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=667060.0, ans=0.0 2023-09-30 09:22:35,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 09:22:36,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 09:22:38,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. 
Duration: 0.88 2023-09-30 09:22:38,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:38,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 09:22:40,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:22:45,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:22:48,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 09:22:49,462 INFO [train.py:1039] (0/4) Epoch 19, batch 4450, loss[loss=0.1856, simple_loss=0.2461, pruned_loss=0.06253, over 23805.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2545, pruned_loss=0.05152, over 4720372.48 frames. ], batch size: 164, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:22:51,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:53,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:53,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:23:02,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:02,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:23:05,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:07,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:23:07,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=667193.3333333334, ans=0.125 2023-09-30 09:23:10,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:23:10,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:11,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 09:23:11,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:11,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:11,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:11,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:23:15,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-30 09:23:17,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=667193.3333333334, ans=0.0 2023-09-30 09:23:21,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:22,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:24,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:24,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:25,172 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:23:26,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:23:26,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=667260.0, ans=0.125 2023-09-30 09:23:30,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:23:31,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 09:23:31,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 09:23:31,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 09:23:34,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:35,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 09:23:40,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:23:41,910 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=667326.6666666666, ans=0.0 2023-09-30 09:23:43,804 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.20 vs. limit=10.0 2023-09-30 09:23:44,621 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:44,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 09:23:44,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:44,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:23:44,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:23:46,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:48,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 09:23:50,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=667326.6666666666, ans=0.0 2023-09-30 09:23:52,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:23:54,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 09:23:54,657 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=667393.3333333334, ans=0.125 2023-09-30 09:23:55,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:23:59,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:59,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:24:01,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:01,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:24:02,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:24:05,686 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.15 vs. limit=15.0 2023-09-30 09:24:06,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. 
Duration: 0.85 2023-09-30 09:24:08,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:24:10,378 INFO [train.py:1039] (0/4) Epoch 19, batch 4500, loss[loss=0.1875, simple_loss=0.2538, pruned_loss=0.06061, over 23626.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2549, pruned_loss=0.05193, over 4719049.17 frames. ], batch size: 149, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:24:12,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:13,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 09:24:13,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 09:24:14,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=667460.0, ans=0.125 2023-09-30 09:24:16,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:21,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:23,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:23,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:24:25,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:24:25,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:25,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:25,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=667526.6666666666, ans=0.1 2023-09-30 09:24:40,000 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.864e+02 2.108e+02 2.361e+02 3.088e+02, threshold=4.216e+02, percent-clipped=0.0 2023-09-30 09:24:40,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:40,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=667526.6666666666, ans=0.125 2023-09-30 09:24:41,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:24:43,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:24:43,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:24:44,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:24:51,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 09:24:54,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:24:59,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:25:02,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:25:02,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 09:25:04,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:04,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:06,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=667660.0, ans=0.125 2023-09-30 09:25:08,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:08,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:25:11,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:25:11,342 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 09:25:11,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 09:25:11,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:17,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:25:17,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:25:19,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=667726.6666666666, ans=0.125 2023-09-30 09:25:20,572 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:22,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:25:22,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:25:23,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 09:25:26,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 09:25:26,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 09:25:31,106 INFO [train.py:1039] (0/4) Epoch 19, batch 4550, loss[loss=0.1698, simple_loss=0.2228, pruned_loss=0.0584, over 19414.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2529, pruned_loss=0.05119, over 4709843.83 frames. 
], batch size: 388, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:25:31,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 09:25:33,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 09:25:35,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:38,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:40,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:42,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:45,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:25:47,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:48,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:25:48,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:25:48,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:25:51,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:51,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:54,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:25:56,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 09:25:56,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=667860.0, ans=0.1 2023-09-30 09:25:58,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 09:25:58,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:25:59,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 09:26:05,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 09:26:07,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:08,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=667926.6666666666, ans=0.125 2023-09-30 09:26:10,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. 
Duration: 0.74 2023-09-30 09:26:13,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:26:15,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:26:18,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 09:26:20,396 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:26:21,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:24,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:24,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:26,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:27,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 09:26:28,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-09-30 09:26:28,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:26:29,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 09:26:32,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 09:26:32,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:34,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:35,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:35,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:35,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:26:38,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:26:38,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 09:26:40,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:40,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-30 09:26:41,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=668060.0, ans=0.125 2023-09-30 09:26:42,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 09:26:42,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:26:42,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 09:26:45,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:26:45,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:26:47,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:26:49,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:49,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:26:49,418 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=668060.0, ans=0.125 2023-09-30 09:26:50,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:26:53,377 INFO [train.py:1039] (0/4) Epoch 19, batch 4600, loss[loss=0.1789, simple_loss=0.2634, pruned_loss=0.04719, over 24343.00 frames. 
], tot_loss[loss=0.1769, simple_loss=0.2521, pruned_loss=0.05082, over 4717418.75 frames. ], batch size: 74, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:26:53,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:26:56,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:56,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:58,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:26:58,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:26:59,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:01,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 09:27:03,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:27:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:27:09,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:10,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:27:18,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 09:27:18,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:22,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:25,423 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.849e+02 2.100e+02 2.657e+02 4.568e+02, threshold=4.200e+02, percent-clipped=2.0 2023-09-30 09:27:27,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:27:27,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:32,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 09:27:32,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:27:33,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:27:38,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:38,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:27:40,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 09:27:45,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 09:27:47,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:27:52,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:53,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:27:55,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:55,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 09:27:56,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:58,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 09:27:58,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:00,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:01,766 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:01,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:28:01,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:03,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 09:28:04,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 09:28:04,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 09:28:04,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:06,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:07,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:08,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:16,077 INFO [train.py:1039] (0/4) Epoch 19, batch 4650, loss[loss=0.1889, simple_loss=0.2779, pruned_loss=0.04992, over 24309.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2514, pruned_loss=0.05015, over 4723786.34 frames. ], batch size: 74, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:28:19,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 09:28:21,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=668460.0, ans=0.125 2023-09-30 09:28:22,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:22,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:24,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:28:24,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:24,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:24,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:29,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 09:28:31,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:28:32,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=668526.6666666666, ans=0.0 2023-09-30 09:28:34,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 09:28:34,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 09:28:36,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 09:28:36,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:28:36,427 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 09:28:37,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 09:28:37,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:39,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:28:42,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:28:43,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:43,911 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 09:28:46,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:48,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 09:28:52,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 09:28:52,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:28:53,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 09:28:53,977 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:28:55,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:57,425 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=668593.3333333334, ans=0.0 2023-09-30 09:28:58,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:29:02,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:07,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:09,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=668660.0, ans=0.125 2023-09-30 09:29:10,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:10,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 09:29:10,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=668660.0, ans=0.2 2023-09-30 09:29:10,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=668660.0, ans=0.1 2023-09-30 09:29:12,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:29:15,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 09:29:16,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 09:29:16,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 09:29:16,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 09:29:17,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=668660.0, ans=0.0 2023-09-30 09:29:18,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:25,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:29:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:26,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. 
Duration: 0.92 2023-09-30 09:29:26,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:26,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:26,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:29:30,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:29:31,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:29:31,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:33,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:34,359 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=668726.6666666666, ans=0.125 2023-09-30 09:29:34,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=668726.6666666666, ans=0.0 2023-09-30 09:29:38,514 INFO [train.py:1039] (0/4) Epoch 19, batch 4700, loss[loss=0.1966, simple_loss=0.2617, pruned_loss=0.06572, over 23838.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2511, pruned_loss=0.04995, over 4723900.79 frames. ], batch size: 179, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:29:38,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:29:38,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:29:38,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:29:38,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 09:29:40,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:29:41,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 09:29:50,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=668793.3333333334, ans=0.5 2023-09-30 09:29:51,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:51,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:52,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:29:52,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:55,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 09:29:59,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=668860.0, ans=0.125 2023-09-30 09:30:00,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 09:30:02,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 09:30:03,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:05,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:30:05,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:30:09,364 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.848e+02 1.982e+02 2.168e+02 3.287e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 09:30:11,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:15,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:30:17,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:30:21,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 09:30:27,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 09:30:28,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:30:31,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:35,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 09:30:37,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:30:41,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:30:42,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 09:30:42,793 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=669060.0, ans=0.1 2023-09-30 09:30:42,838 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=669060.0, ans=0.125 2023-09-30 09:30:44,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:44,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:30:44,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=669060.0, ans=0.125 2023-09-30 09:30:47,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:48,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:30:49,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 09:30:49,132 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 09:30:50,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:52,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 09:30:52,503 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=669060.0, ans=0.125 2023-09-30 09:30:53,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:58,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. 
Duration: 0.95 2023-09-30 09:31:00,084 INFO [train.py:1039] (0/4) Epoch 19, batch 4750, loss[loss=0.1949, simple_loss=0.261, pruned_loss=0.06441, over 23642.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2517, pruned_loss=0.0506, over 4722139.85 frames. ], batch size: 232, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:31:01,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:31:03,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:31:09,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 09:31:10,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:15,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 09:31:16,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:31:16,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:31:16,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:31:24,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 09:31:27,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=669193.3333333334, ans=0.0 2023-09-30 09:31:28,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:31:31,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 09:31:31,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:34,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:36,511 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 09:31:36,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 09:31:43,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 09:31:46,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 09:31:49,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:31:51,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:31:51,541 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 09:31:51,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:31:53,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:31:56,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:31:58,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 09:31:58,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 09:31:58,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:58,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:31:58,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:00,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-30 09:32:01,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 09:32:03,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=669326.6666666666, ans=0.125 2023-09-30 09:32:05,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 09:32:09,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:11,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:32:11,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 09:32:11,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:13,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:16,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:32:16,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:18,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 09:32:20,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=669393.3333333334, ans=0.0 2023-09-30 09:32:20,091 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:32:21,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:21,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 09:32:21,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 09:32:23,297 INFO [train.py:1039] (0/4) Epoch 19, batch 4800, loss[loss=0.1568, simple_loss=0.2417, pruned_loss=0.03598, over 24643.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2531, pruned_loss=0.05118, over 4724205.26 frames. ], batch size: 68, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:32:23,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 09:32:26,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:32:26,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:26,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=669460.0, ans=0.125 2023-09-30 09:32:28,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. 
Duration: 0.85 2023-09-30 09:32:28,281 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=669460.0, ans=0.125 2023-09-30 09:32:34,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:36,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:40,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:32:41,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff3.min_abs, batch_count=669526.6666666666, ans=0.2 2023-09-30 09:32:42,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:43,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:43,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 09:32:44,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:44,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:32:44,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:32:51,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:32:52,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:52,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:32:54,170 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.880e+02 2.098e+02 2.403e+02 3.356e+02, threshold=4.196e+02, percent-clipped=0.0 2023-09-30 09:32:54,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:55,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:32:55,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:55,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:58,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:59,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-09-30 09:33:04,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:33:06,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:07,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 09:33:07,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 09:33:09,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:09,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:33:09,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:33:09,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:09,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:33:12,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:33:13,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:33:16,496 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=669660.0, ans=0.125 2023-09-30 09:33:17,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:20,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:23,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:27,509 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=669726.6666666666, ans=0.0 2023-09-30 09:33:28,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 09:33:28,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:30,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:30,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:33:30,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:34,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 09:33:35,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:33:35,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:35,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:33:37,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:33:37,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:33:42,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:43,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:43,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:45,378 INFO [train.py:1039] (0/4) Epoch 19, batch 4850, loss[loss=0.1675, simple_loss=0.2411, pruned_loss=0.04695, over 20187.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2537, pruned_loss=0.05128, over 4726913.68 frames. ], batch size: 44, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:33:45,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 09:33:47,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. 
Duration: 0.69 2023-09-30 09:33:47,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:33:47,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:50,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:52,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=669793.3333333334, ans=0.1 2023-09-30 09:33:57,327 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=669793.3333333334, ans=0.0 2023-09-30 09:34:00,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 09:34:00,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:05,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:07,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. 
Duration: 0.5909375 2023-09-30 09:34:07,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:11,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:13,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:34:14,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:34:14,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 09:34:19,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:34:20,831 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:34:20,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:34:22,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:34:22,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 09:34:25,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:25,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:34:28,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:28,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 09:34:30,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 09:34:31,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:34:39,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:34:39,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 09:34:42,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:34:42,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:34:44,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:34:46,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 09:34:46,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:47,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. 
Duration: 0.82 2023-09-30 09:34:47,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:49,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:34:49,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 09:34:58,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:05,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:35:05,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:08,133 INFO [train.py:1039] (0/4) Epoch 19, batch 4900, loss[loss=0.1647, simple_loss=0.2427, pruned_loss=0.04333, over 24474.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2524, pruned_loss=0.05088, over 4734998.02 frames. ], batch size: 58, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:35:11,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 09:35:11,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:35:16,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:18,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:35:18,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:35:21,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 09:35:24,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 09:35:29,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 09:35:31,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 09:35:31,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:31,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:31,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:35:31,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:31,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:35:33,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 09:35:38,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-09-30 09:35:38,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:35:39,984 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 2.060e+02 2.332e+02 2.793e+02 4.381e+02, threshold=4.664e+02, percent-clipped=2.0 2023-09-30 09:35:41,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:35:41,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:43,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:35:44,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:46,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:46,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 09:35:47,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=670260.0, ans=0.125 2023-09-30 09:35:49,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:35:49,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:49,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. 
Duration: 0.81 2023-09-30 09:35:49,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 09:35:51,689 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=670260.0, ans=0.0 2023-09-30 09:35:53,692 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.58 vs. limit=6.0 2023-09-30 09:35:55,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 09:35:56,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:35:58,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:35:58,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:35:59,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:59,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:35:59,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:36:01,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 09:36:03,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:36:04,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:36:04,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:36:07,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 09:36:10,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:36:11,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 09:36:11,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 09:36:20,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:21,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:36:23,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 09:36:23,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:23,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:36:25,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:36:29,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:29,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:36:31,099 INFO [train.py:1039] (0/4) Epoch 19, batch 4950, loss[loss=0.1932, simple_loss=0.2751, pruned_loss=0.05566, over 23730.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2516, pruned_loss=0.05088, over 4719445.52 frames. ], batch size: 94, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:36:31,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:31,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:36:32,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:36:34,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:35,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:37,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 09:36:37,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 09:36:37,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 09:36:39,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 09:36:39,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:39,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:36:39,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:36:39,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:36:40,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:42,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:36:44,390 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:36:46,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:48,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=670526.6666666666, ans=0.0 2023-09-30 09:36:49,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:49,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:36:54,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:36:58,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:59,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:37:01,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:02,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:04,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:37:05,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 09:37:07,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 09:37:10,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:12,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:37:12,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 09:37:12,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=670593.3333333334, ans=0.07 2023-09-30 09:37:13,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:37:13,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:37:15,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:37:16,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:18,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:37:20,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:37:22,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:22,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:24,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 09:37:24,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 09:37:25,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=670660.0, ans=0.0 2023-09-30 09:37:25,571 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.30 vs. limit=15.0 2023-09-30 09:37:26,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:37:30,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:37:32,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:37:32,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:37:34,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:34,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:37:34,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:37:35,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:37:37,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 09:37:37,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:39,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 09:37:43,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:37:47,071 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=670726.6666666666, ans=0.125 2023-09-30 09:37:49,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 09:37:49,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:37:53,430 INFO [train.py:1039] (0/4) Epoch 19, batch 5000, loss[loss=0.1937, simple_loss=0.2594, pruned_loss=0.06402, over 23263.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2508, pruned_loss=0.05066, over 4714112.24 frames. ], batch size: 119, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:37:57,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:57,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:37:58,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 09:38:00,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. 
Duration: 0.95 2023-09-30 09:38:02,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:04,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 09:38:04,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:38:04,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:38:05,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 09:38:07,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:08,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:08,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 09:38:08,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:10,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:10,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 09:38:11,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. 
Duration: 0.87 2023-09-30 09:38:12,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:38:13,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 09:38:13,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:38:14,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:15,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:38:15,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 09:38:15,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 09:38:15,324 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=670860.0, ans=0.025 2023-09-30 09:38:18,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 09:38:18,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:18,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:19,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. 
Duration: 0.86 2023-09-30 09:38:19,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:38:22,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:22,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:24,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:38:26,581 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.806e+02 2.052e+02 2.286e+02 3.432e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 09:38:26,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 09:38:28,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:38:29,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:38:34,790 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 09:38:38,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:38,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:38,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:38:39,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=670926.6666666666, ans=0.0 2023-09-30 09:38:42,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=670993.3333333334, ans=0.1 2023-09-30 09:38:43,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 09:38:43,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:43,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:43,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:38:46,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 09:38:46,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:49,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:51,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:55,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 09:38:59,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:39:07,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:39:10,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:10,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:39:10,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:10,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:39:10,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:39:11,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:14,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:16,094 INFO [train.py:1039] (0/4) Epoch 19, batch 5050, loss[loss=0.1523, simple_loss=0.2305, pruned_loss=0.03699, over 24390.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.252, pruned_loss=0.05079, over 4722570.14 frames. ], batch size: 58, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:39:16,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 09:39:16,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 09:39:17,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:19,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:39:19,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 09:39:20,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:20,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:39:22,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:39:24,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:39:25,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:39:38,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 09:39:39,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:39:39,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 09:39:40,036 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=671193.3333333334, ans=0.0 2023-09-30 09:39:41,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 09:39:41,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:39:41,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=671193.3333333334, ans=0.125 2023-09-30 09:39:44,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:44,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:46,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:39:46,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 09:39:47,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 09:39:47,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:50,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:39:54,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 09:39:54,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 09:39:55,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:39:57,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=671260.0, ans=0.2 2023-09-30 09:40:00,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 09:40:01,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:40:01,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:40:03,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:03,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:40:06,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:06,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=671326.6666666666, ans=0.2 2023-09-30 09:40:07,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:40:09,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:40:09,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:40:09,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:40:09,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 09:40:11,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:40:13,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:40:19,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:40:19,145 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 09:40:19,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:40:19,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=671326.6666666666, ans=10.0 2023-09-30 09:40:20,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:22,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:22,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. 
Duration: 0.896 2023-09-30 09:40:23,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:23,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 09:40:23,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:26,084 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.25 vs. limit=22.5 2023-09-30 09:40:28,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:28,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:28,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 09:40:30,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 09:40:31,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:31,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:40:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:40:36,305 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. 
Duration: 0.768 2023-09-30 09:40:37,731 INFO [train.py:1039] (0/4) Epoch 19, batch 5100, loss[loss=0.1571, simple_loss=0.2335, pruned_loss=0.04035, over 24269.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2525, pruned_loss=0.05058, over 4730745.13 frames. ], batch size: 56, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:40:37,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:39,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=671460.0, ans=0.0 2023-09-30 09:40:40,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 09:40:41,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 09:40:43,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:44,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:48,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:48,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 09:40:50,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 09:40:54,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:40:54,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:40:58,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:58,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=671526.6666666666, ans=0.025 2023-09-30 09:41:00,911 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.78 vs. limit=6.0 2023-09-30 09:41:01,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 09:41:03,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:04,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:41:04,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:41:07,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. 
Duration: 0.81 2023-09-30 09:41:10,884 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.887e+02 2.088e+02 2.426e+02 5.296e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-30 09:41:11,037 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 09:41:11,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:12,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 09:41:12,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 09:41:16,364 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=671593.3333333334, ans=0.125 2023-09-30 09:41:17,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:19,359 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:41:26,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:41:30,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 09:41:30,219 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 09:41:30,232 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 09:41:33,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. 
Duration: 0.81 2023-09-30 09:41:33,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:34,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 09:41:36,812 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=671660.0, ans=0.125 2023-09-30 09:41:39,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 09:41:40,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:41:42,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:41:45,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 09:41:47,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:41:47,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 09:41:53,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:41:53,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:41:53,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 09:41:54,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:41:54,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:41:56,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:41:58,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 09:41:58,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 09:41:58,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 09:41:59,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:41:59,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 09:42:00,982 INFO [train.py:1039] (0/4) Epoch 19, batch 5150, loss[loss=0.1489, simple_loss=0.2235, pruned_loss=0.0371, over 24357.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2533, pruned_loss=0.0507, over 4718729.88 frames. ], batch size: 56, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:42:01,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:01,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. 
Duration: 0.4545625 2023-09-30 09:42:03,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:04,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:10,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:42:10,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 09:42:10,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:12,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:42:14,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:42:14,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:14,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:15,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:42:15,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:42:17,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. 
Duration: 0.84 2023-09-30 09:42:18,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:42:18,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:42:21,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:42:25,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 09:42:25,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:42:29,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:42:34,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 09:42:39,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:43,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:45,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:51,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:42:53,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 09:42:54,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 09:42:59,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:59,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:42:59,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:43:03,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:03,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:43:04,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 09:43:10,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:11,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=672060.0, ans=0.125 2023-09-30 09:43:12,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:43:15,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:43:15,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 09:43:15,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:43:16,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:43:16,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:43:17,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:43:17,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=672060.0, ans=0.0 2023-09-30 09:43:20,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:43:21,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:43:23,308 INFO [train.py:1039] (0/4) Epoch 19, batch 5200, loss[loss=0.2049, simple_loss=0.2558, pruned_loss=0.07703, over 19834.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2536, pruned_loss=0.05098, over 4720656.74 frames. ], batch size: 388, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:43:23,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:29,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 09:43:30,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 09:43:30,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:31,284 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=672126.6666666666, ans=0.0 2023-09-30 09:43:32,570 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=672126.6666666666, ans=0.0 2023-09-30 09:43:33,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:36,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:43:37,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:39,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 09:43:42,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:43:44,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:48,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 09:43:49,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:43:51,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 09:43:51,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 09:43:52,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 09:43:54,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 09:43:56,143 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.839e+02 2.019e+02 2.213e+02 3.440e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 09:43:56,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:56,281 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 09:43:56,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:57,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:57,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:43:59,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 09:44:00,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:02,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 09:44:05,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 09:44:05,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 09:44:07,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 09:44:11,143 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.78 vs. limit=15.0 2023-09-30 09:44:12,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 09:44:14,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:44:21,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:44:21,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:23,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 09:44:23,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:23,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 09:44:23,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 09:44:23,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:44:25,578 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=672326.6666666666, ans=0.0 2023-09-30 09:44:28,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:28,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:44:33,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:44:33,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:33,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:38,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:38,328 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=672393.3333333334, ans=0.0 2023-09-30 09:44:38,905 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=12.0 2023-09-30 09:44:39,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 09:44:39,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 09:44:41,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:44:41,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:42,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:44:42,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:44:45,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:46,778 INFO [train.py:1039] (0/4) Epoch 19, batch 5250, loss[loss=0.1623, simple_loss=0.2424, pruned_loss=0.04108, over 24484.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2523, pruned_loss=0.0506, over 4728099.93 frames. ], batch size: 66, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:44:47,180 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:44:48,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:48,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:44:50,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:44:56,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:44:56,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:44:58,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:45:01,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:45:02,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 09:45:02,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:45:04,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:45:15,338 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.20 vs. limit=15.0 2023-09-30 09:45:16,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=672526.6666666666, ans=0.1 2023-09-30 09:45:17,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=672593.3333333334, ans=0.2 2023-09-30 09:45:33,741 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.81 vs. limit=6.0 2023-09-30 09:45:36,325 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.47 vs. 
limit=15.0 2023-09-30 09:45:41,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=672660.0, ans=0.2 2023-09-30 09:45:51,821 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672726.6666666666, ans=0.1 2023-09-30 09:46:02,580 INFO [train.py:1039] (0/4) Epoch 19, batch 5300, loss[loss=0.1676, simple_loss=0.2437, pruned_loss=0.04576, over 24601.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2508, pruned_loss=0.05076, over 4702360.35 frames. ], batch size: 60, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:46:16,050 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-19.pt 2023-09-30 09:46:22,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:46:22,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 09:46:22,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 09:46:22,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:23,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:23,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:23,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:46:23,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:23,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:23,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:23,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:46:24,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:46:24,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 09:46:24,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 09:46:24,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 09:46:24,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:46:24,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 09:46:24,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 09:46:24,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:46:25,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:25,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:25,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:26,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:46:26,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:26,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:26,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:26,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:26,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:26,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:46:26,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:46:26,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:46:27,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 09:46:27,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:28,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:28,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 09:46:28,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 09:46:28,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:46:28,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:28,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 09:46:29,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 09:46:29,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:30,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 09:46:30,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:30,477 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 09:46:30,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 09:46:30,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:46:30,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:30,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 09:46:30,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 09:46:31,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 09:46:31,316 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:34,304 INFO [train.py:1039] (0/4) Epoch 20, batch 0, loss[loss=0.1696, simple_loss=0.2537, pruned_loss=0.04275, over 24648.00 frames. ], tot_loss[loss=0.1696, simple_loss=0.2537, pruned_loss=0.04275, over 24648.00 frames. ], batch size: 68, lr: 5.21e-03, grad_scale: 32.0 2023-09-30 09:46:34,305 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 09:46:47,937 INFO [train.py:1071] (0/4) Epoch 20, validation: loss=0.2867, simple_loss=0.2695, pruned_loss=0.152, over 1125622.00 frames. 
2023-09-30 09:46:47,938 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20683MB 2023-09-30 09:46:49,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 09:46:49,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:46:52,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:46:55,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:57,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:46:57,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:57,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=672866.6666666666, ans=0.0 2023-09-30 09:46:58,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 09:47:01,566 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.839e+02 2.043e+02 2.275e+02 5.407e+02, threshold=4.087e+02, percent-clipped=3.0 2023-09-30 09:47:01,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 09:47:05,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:47:05,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:05,575 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=672933.3333333334, ans=0.125 2023-09-30 09:47:08,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:10,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:10,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:47:10,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:13,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 09:47:15,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:24,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:47:24,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:25,851 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 09:47:30,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 09:47:30,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:47:31,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:35,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:47:37,436 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.00 vs. limit=15.0 2023-09-30 09:47:39,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:46,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 09:47:50,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 09:47:50,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:47:50,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:47:51,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:47:53,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 09:47:56,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 09:47:59,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:01,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:01,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=673133.3333333334, ans=0.125 2023-09-30 09:48:04,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:04,774 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=673133.3333333334, ans=0.125 2023-09-30 09:48:08,940 INFO [train.py:1039] (0/4) Epoch 20, batch 50, loss[loss=0.1922, simple_loss=0.2613, pruned_loss=0.06162, over 23702.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2542, pruned_loss=0.05068, over 1077654.09 frames. ], batch size: 232, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:48:09,030 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 09:48:10,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:48:13,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:16,198 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.41 vs. 
limit=22.5 2023-09-30 09:48:16,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:48:16,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 09:48:17,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:48:17,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:48:19,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=673200.0, ans=0.1 2023-09-30 09:48:24,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:29,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 09:48:29,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:36,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:48:38,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. 
Duration: 0.83 2023-09-30 09:48:39,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 09:48:41,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:48:43,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:48:43,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:44,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:48:45,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:48:45,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:48:45,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:53,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:48:56,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:57,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:48:57,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. 
Duration: 0.96 2023-09-30 09:48:57,370 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=673400.0, ans=0.125 2023-09-30 09:49:00,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:49:00,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:49:00,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 09:49:00,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:03,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 09:49:10,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:10,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:49:12,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:12,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:12,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:15,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. 
Duration: 0.71 2023-09-30 09:49:15,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 09:49:17,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:17,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:18,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:49:18,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=673466.6666666666, ans=0.1 2023-09-30 09:49:20,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:20,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 09:49:21,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 09:49:21,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:49:23,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:23,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:49:25,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. 
Duration: 0.59 2023-09-30 09:49:25,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 09:49:27,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:28,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:29,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:49:30,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:49:31,409 INFO [train.py:1039] (0/4) Epoch 20, batch 100, loss[loss=0.1583, simple_loss=0.242, pruned_loss=0.03732, over 24674.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2552, pruned_loss=0.05112, over 1896714.44 frames. ], batch size: 65, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:49:33,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:49:35,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:49:39,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=673533.3333333334, ans=0.125 2023-09-30 09:49:40,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:42,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. 
Duration: 0.83 2023-09-30 09:49:42,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:45,879 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-09-30 09:49:46,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:49:46,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:48,164 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.857e+02 2.032e+02 2.240e+02 3.945e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 09:49:48,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:48,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:48,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:49,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 09:49:51,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:49:51,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:49:53,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:53,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:56,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=673600.0, ans=0.035 2023-09-30 09:49:57,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 09:49:59,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:00,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:02,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:50:05,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:50:08,774 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 09:50:08,813 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 09:50:10,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:10,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 09:50:14,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:50:15,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:19,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:24,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:25,756 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 09:50:27,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:50:30,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:50:31,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:50:34,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:37,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:38,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=673800.0, ans=0.125 2023-09-30 09:50:41,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 09:50:43,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:50:46,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:46,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=673800.0, ans=0.2 2023-09-30 09:50:48,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:49,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:49,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:50:49,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:51,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 09:50:51,197 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 09:50:51,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:52,554 INFO [train.py:1039] (0/4) Epoch 20, batch 150, loss[loss=0.1623, simple_loss=0.2381, pruned_loss=0.04324, over 23551.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2557, pruned_loss=0.05187, over 2516426.21 frames. 
], batch size: 135, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:50:52,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:50:55,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:55,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:55,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:50:55,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:50:55,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:50:56,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:56,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:58,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:59,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:50:59,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:51:02,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:03,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=673866.6666666666, ans=0.0 2023-09-30 09:51:04,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:51:04,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:04,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=673866.6666666666, ans=0.1 2023-09-30 09:51:06,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:09,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:09,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:09,778 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.46 vs. limit=22.5 2023-09-30 09:51:12,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:51:13,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 09:51:17,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 09:51:17,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 09:51:17,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 09:51:17,923 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:51:22,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:51:22,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:51:22,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:51:24,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:51:24,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:25,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:25,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:29,379 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. 
Duration: 0.896 2023-09-30 09:51:30,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:31,081 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=674000.0, ans=0.125 2023-09-30 09:51:31,716 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.94 vs. limit=12.0 2023-09-30 09:51:36,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=674000.0, ans=0.015 2023-09-30 09:51:37,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:42,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:51:42,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 09:51:43,268 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.03 vs. limit=12.0 2023-09-30 09:51:46,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:51:46,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:46,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:51:49,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 09:51:52,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:52,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:51:53,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:55,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 09:51:59,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:01,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:01,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:52:01,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:52:04,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:07,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 09:52:09,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:52:10,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 09:52:10,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:13,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:52:13,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 09:52:13,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:52:15,046 INFO [train.py:1039] (0/4) Epoch 20, batch 200, loss[loss=0.1642, simple_loss=0.248, pruned_loss=0.04019, over 24659.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.256, pruned_loss=0.0523, over 3007802.26 frames. ], batch size: 68, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:52:15,142 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 09:52:18,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:21,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:52:21,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:52:24,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 09:52:26,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 09:52:26,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:28,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 09:52:30,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:52:31,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:33,174 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.908e+02 2.091e+02 2.356e+02 3.035e+02, threshold=4.181e+02, percent-clipped=0.0 2023-09-30 09:52:34,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:37,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:52:37,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:37,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:58,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:52:58,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:53:00,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:53:01,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:02,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:53:02,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:53:05,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:05,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:53:07,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:07,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:10,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 09:53:10,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:53:10,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 09:53:12,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=674400.0, ans=0.04949747468305833 2023-09-30 09:53:13,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:53:15,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=674400.0, ans=0.2 2023-09-30 09:53:21,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:26,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=674466.6666666666, ans=0.015 2023-09-30 09:53:27,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:29,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:53:34,273 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:34,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=674533.3333333334, ans=0.125 2023-09-30 09:53:36,189 INFO [train.py:1039] (0/4) Epoch 20, batch 250, loss[loss=0.1699, simple_loss=0.2406, pruned_loss=0.04953, over 23616.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.256, pruned_loss=0.05181, over 3388032.53 frames. ], batch size: 149, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:53:37,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. 
Duration: 0.59 2023-09-30 09:53:37,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:37,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:53:37,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:39,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:53:41,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 09:53:42,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:53:42,849 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 09:53:46,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:48,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:53:49,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:49,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:51,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 09:53:51,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:54,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:57,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:53:58,520 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.35 vs. limit=15.0 2023-09-30 09:54:09,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:10,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:54:10,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:54:16,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:54:18,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:54:19,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:54:19,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 09:54:21,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:54:21,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:54:21,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:24,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:54:28,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 09:54:28,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:29,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:54:29,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:54:29,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:54:29,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=674733.3333333334, ans=0.1 2023-09-30 09:54:31,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 09:54:31,614 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=674733.3333333334, ans=0.125 2023-09-30 09:54:32,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:54:32,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:54:34,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:36,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=674733.3333333334, ans=0.0 2023-09-30 09:54:37,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:54:37,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:41,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:54:45,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:51,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:54:54,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 09:54:56,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:54:59,544 INFO [train.py:1039] (0/4) Epoch 20, batch 300, loss[loss=0.1918, simple_loss=0.2575, pruned_loss=0.06302, over 23434.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2547, pruned_loss=0.05166, over 3685938.30 frames. ], batch size: 134, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:54:59,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 09:55:01,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:01,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:55:01,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 09:55:02,533 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.27 vs. limit=22.5 2023-09-30 09:55:03,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:55:03,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:55:04,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 09:55:09,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:55:10,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:13,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:55:14,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 09:55:16,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:55:16,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:55:17,751 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.911e+02 2.106e+02 2.458e+02 4.276e+02, threshold=4.211e+02, percent-clipped=1.0 2023-09-30 09:55:17,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 09:55:17,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:21,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:55:24,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:55:25,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 09:55:29,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. 
Duration: 0.81 2023-09-30 09:55:29,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:32,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:36,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:36,337 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 09:55:36,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:55:39,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:55:40,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:55:40,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:41,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=675000.0, ans=0.1 2023-09-30 09:55:46,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:55:46,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 09:55:49,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 09:55:52,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:53,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 09:55:54,380 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.32 vs. limit=15.0 2023-09-30 09:55:55,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:59,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:01,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:56:01,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 09:56:07,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:07,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:56:08,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:11,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:56:12,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. 
Duration: 0.64 2023-09-30 09:56:12,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:56:12,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:14,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 09:56:17,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:17,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:17,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=675133.3333333334, ans=0.0 2023-09-30 09:56:19,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:19,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:20,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:22,136 INFO [train.py:1039] (0/4) Epoch 20, batch 350, loss[loss=0.1618, simple_loss=0.2404, pruned_loss=0.04156, over 24314.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2522, pruned_loss=0.05123, over 3905114.31 frames. ], batch size: 61, lr: 5.20e-03, grad_scale: 4.0 2023-09-30 09:56:24,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 09:56:24,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:56:27,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:34,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:37,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:38,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:42,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 09:56:43,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:44,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 09:56:47,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:48,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 09:56:48,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 09:56:50,621 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=675266.6666666666, ans=0.125 2023-09-30 09:56:51,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 09:56:53,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:56:55,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:56,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:58,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:56:58,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:58,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:57:01,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:01,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 09:57:09,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:09,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:57:09,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:57:11,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:15,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 09:57:15,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:22,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:22,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:22,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:57:23,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 09:57:25,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:27,037 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. 
Duration: 0.976 2023-09-30 09:57:28,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 09:57:28,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:28,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=675466.6666666666, ans=0.125 2023-09-30 09:57:31,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:57:31,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 09:57:34,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=675466.6666666666, ans=0.125 2023-09-30 09:57:35,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:35,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=675466.6666666666, ans=0.2 2023-09-30 09:57:36,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:57:38,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:40,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 09:57:40,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:42,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:43,418 INFO [train.py:1039] (0/4) Epoch 20, batch 400, loss[loss=0.1811, simple_loss=0.2478, pruned_loss=0.0572, over 22715.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2517, pruned_loss=0.05092, over 4089453.56 frames. ], batch size: 322, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:57:43,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:45,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:57:47,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 09:57:48,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:48,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:50,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:57:50,215 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=675533.3333333334, ans=0.125 2023-09-30 09:57:52,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 09:57:53,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:55,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:56,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 09:57:57,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 09:57:57,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:58:00,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 09:58:00,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:02,183 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:58:03,699 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.898e+02 2.066e+02 2.335e+02 3.981e+02, threshold=4.133e+02, percent-clipped=0.0 2023-09-30 09:58:03,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:58:03,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:04,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. 
Duration: 0.82 2023-09-30 09:58:05,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:58:05,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:05,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:06,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:58:09,347 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 09:58:10,019 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.31 vs. limit=15.0 2023-09-30 09:58:10,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 09:58:15,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:58:18,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:19,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 09:58:20,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 09:58:23,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 09:58:23,878 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=675666.6666666666, ans=6.0 2023-09-30 09:58:25,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:31,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=675733.3333333334, ans=0.125 2023-09-30 09:58:33,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 09:58:34,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:58:36,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 09:58:38,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:41,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:58:41,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 09:58:45,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:58:47,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:58:48,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:58:51,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:51,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 09:58:53,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:58:54,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 09:58:58,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:58:58,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:59:01,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 09:59:01,835 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=675800.0, ans=0.2 2023-09-30 09:59:03,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:59:03,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:59:04,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:59:04,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. 
Duration: 0.86 2023-09-30 09:59:04,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:59:05,529 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.51 vs. limit=22.5 2023-09-30 09:59:06,106 INFO [train.py:1039] (0/4) Epoch 20, batch 450, loss[loss=0.1997, simple_loss=0.2709, pruned_loss=0.0642, over 23679.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2527, pruned_loss=0.05187, over 4218012.71 frames. ], batch size: 135, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:59:06,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:59:07,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:59:07,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 09:59:09,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:59:10,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:59:13,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:59:25,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:25,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 09:59:26,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 09:59:28,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 09:59:32,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:59:33,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:34,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:35,707 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=675933.3333333334, ans=0.0 2023-09-30 09:59:40,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:41,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:44,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 09:59:44,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 09:59:47,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 09:59:49,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 09:59:49,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:51,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:59:53,436 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 09:59:53,450 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 09:59:53,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:57,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:59:58,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:00:01,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:00:01,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:00:03,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:00:05,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 10:00:06,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 10:00:09,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:00:09,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:00:12,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 10:00:16,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:00:17,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 10:00:19,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 10:00:19,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:26,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:00:27,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:28,041 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=676200.0, ans=0.0 2023-09-30 10:00:29,609 INFO [train.py:1039] (0/4) Epoch 20, batch 500, loss[loss=0.1603, simple_loss=0.2521, pruned_loss=0.03424, over 24408.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2534, pruned_loss=0.05155, over 4338564.81 frames. 
], batch size: 69, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:00:29,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:00:29,762 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 10:00:33,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:35,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:00:35,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:35,155 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 10:00:37,299 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.75 vs. limit=22.5 2023-09-30 10:00:37,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 10:00:38,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:41,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:00:47,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 10:00:48,765 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.813e+02 1.975e+02 2.252e+02 5.149e+02, threshold=3.950e+02, percent-clipped=1.0 2023-09-30 10:00:48,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:00:51,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:51,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:52,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:05,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:01:05,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:01:05,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 10:01:05,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 10:01:10,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:01:10,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:01:10,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:01:10,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:11,733 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.64 vs. limit=15.0 2023-09-30 10:01:12,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 10:01:15,453 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 10:01:17,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:18,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:20,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 10:01:23,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 10:01:26,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:01:28,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:33,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:34,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:42,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:47,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 10:01:47,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:47,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:49,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 10:01:50,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:01:52,023 INFO [train.py:1039] (0/4) Epoch 20, batch 550, loss[loss=0.1791, simple_loss=0.252, pruned_loss=0.05309, over 23337.00 frames. 
], tot_loss[loss=0.1787, simple_loss=0.2543, pruned_loss=0.05154, over 4428735.32 frames. ], batch size: 93, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:01:52,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:58,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 10:01:58,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=676533.3333333334, ans=0.125 2023-09-30 10:01:59,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 10:01:59,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:59,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 10:02:01,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:02:01,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:02,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 10:02:06,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:02:07,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:02:09,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 10:02:09,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:02:13,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:13,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:17,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:17,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:23,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 10:02:23,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 10:02:24,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:02:30,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 10:02:30,915 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:32,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:02:32,862 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=676666.6666666666, ans=0.02 2023-09-30 10:02:36,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:36,857 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 10:02:37,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:39,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:02:42,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:42,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:02:42,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:02:44,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:45,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. 
Duration: 0.58 2023-09-30 10:02:48,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 10:02:49,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:02:49,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:49,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:02:49,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:52,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:02:54,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:02:54,688 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:02:57,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:02:58,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:59,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 10:03:00,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 10:03:00,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:02,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:03:02,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:03,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:03:03,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:03:10,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 10:03:11,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=676800.0, ans=0.0 2023-09-30 10:03:13,699 INFO [train.py:1039] (0/4) Epoch 20, batch 600, loss[loss=0.2246, simple_loss=0.2812, pruned_loss=0.08398, over 19539.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2548, pruned_loss=0.05186, over 4489689.89 frames. ], batch size: 388, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:03:13,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 10:03:14,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:03:15,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 10:03:15,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:17,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=676866.6666666666, ans=0.0 2023-09-30 10:03:25,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:03:25,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:03:28,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 10:03:29,341 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.30 vs. limit=22.5 2023-09-30 10:03:31,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:03:33,018 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.840e+02 2.105e+02 2.465e+02 3.570e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 10:03:33,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:03:36,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:37,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. 
Duration: 0.87 2023-09-30 10:03:37,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:03:42,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 10:03:47,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:03:47,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:48,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:03:54,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:03:54,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:03:54,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:54,624 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=677000.0, ans=0.125 2023-09-30 10:03:57,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=677000.0, ans=22.5 2023-09-30 10:04:02,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:04:06,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:04:06,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:04:06,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:04:12,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 10:04:17,025 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=677133.3333333334, ans=0.125 2023-09-30 10:04:18,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:04:18,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:04:23,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 10:04:25,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:04:26,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 10:04:27,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:04:28,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:04:34,491 INFO [train.py:1039] (0/4) Epoch 20, batch 650, loss[loss=0.1914, simple_loss=0.2678, pruned_loss=0.05744, over 23707.00 frames. 
], tot_loss[loss=0.1785, simple_loss=0.2535, pruned_loss=0.05171, over 4524732.45 frames. ], batch size: 85, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:04:36,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:04:37,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:04:39,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:04:42,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:04:43,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:04:46,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 10:04:47,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=677200.0, ans=0.125 2023-09-30 10:04:48,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:53,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:04:53,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:04:57,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 10:05:01,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 10:05:04,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:04,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:06,555 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:05:09,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:09,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:05:11,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:11,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:12,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:05:14,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:15,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:05:17,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. 
Duration: 0.611125 2023-09-30 10:05:17,356 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 10:05:17,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:17,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:20,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:20,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:21,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:22,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:05:23,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 10:05:23,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:05:23,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:05:27,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:05:27,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 10:05:29,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:05:31,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 10:05:33,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 10:05:33,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:33,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:33,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:05:34,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:35,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:41,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:41,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:43,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:46,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:05:46,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:05:46,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:51,328 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.60 vs. limit=15.0 2023-09-30 10:05:52,715 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=677466.6666666666, ans=0.125 2023-09-30 10:05:53,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:05:53,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:53,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:05:55,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:55,730 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=677533.3333333334, ans=0.0 2023-09-30 10:05:56,758 INFO [train.py:1039] (0/4) Epoch 20, batch 700, loss[loss=0.1583, simple_loss=0.2294, pruned_loss=0.04361, over 24350.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2513, pruned_loss=0.05115, over 4569936.40 frames. 
], batch size: 56, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:06:00,607 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 10:06:02,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 10:06:04,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 10:06:04,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:06,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:06:08,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 10:06:14,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:15,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:06:17,044 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.862e+02 2.095e+02 2.460e+02 3.900e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 10:06:18,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:18,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:06:20,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 10:06:23,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:25,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:06:25,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:06:26,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 10:06:29,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 10:06:34,318 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=677666.6666666666, ans=0.0 2023-09-30 10:06:35,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:06:35,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:06:37,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:06:41,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:06:41,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 10:06:45,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:06:47,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:06:47,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 10:06:50,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:51,142 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=677733.3333333334, ans=0.0 2023-09-30 10:06:51,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=677733.3333333334, ans=0.125 2023-09-30 10:06:52,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:55,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:00,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:07:00,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 10:07:07,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 10:07:07,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 10:07:10,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:07:10,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:11,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:13,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:13,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 10:07:18,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 10:07:18,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 10:07:18,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 10:07:19,749 INFO [train.py:1039] (0/4) Epoch 20, batch 750, loss[loss=0.1493, simple_loss=0.2249, pruned_loss=0.03684, over 22506.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2509, pruned_loss=0.05119, over 4571105.31 frames. ], batch size: 49, lr: 5.19e-03, grad_scale: 8.0 2023-09-30 10:07:21,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 10:07:21,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 10:07:21,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 10:07:23,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 10:07:23,200 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=677866.6666666666, ans=0.125 2023-09-30 10:07:24,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:24,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:27,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:30,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:30,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:07:32,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:33,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:07:35,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:07:36,095 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.57 vs. 
limit=15.0 2023-09-30 10:07:36,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:07:38,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=677933.3333333334, ans=0.125 2023-09-30 10:07:40,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:40,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:40,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 10:07:43,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:07:43,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:45,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:46,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:07:47,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 10:07:47,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:49,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. 
Duration: 0.54 2023-09-30 10:07:49,969 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 10:07:50,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 10:07:50,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:07:50,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:07:53,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:07:58,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=678000.0, ans=0.125 2023-09-30 10:07:59,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:59,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:07:59,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:08:02,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:08:02,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=678000.0, ans=0.0 2023-09-30 10:08:04,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 10:08:04,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 10:08:04,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=678000.0, ans=0.125 2023-09-30 10:08:05,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:08:05,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 10:08:07,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:08:07,638 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=678066.6666666666, ans=0.125 2023-09-30 10:08:11,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:08:12,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 10:08:12,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:19,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:22,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:08:22,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:08:24,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:08:28,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 10:08:28,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:29,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:32,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:33,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:36,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:37,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:08:39,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=678200.0, ans=0.1 2023-09-30 10:08:40,977 INFO [train.py:1039] (0/4) Epoch 20, batch 800, loss[loss=0.1779, simple_loss=0.2463, pruned_loss=0.05477, over 23966.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2523, pruned_loss=0.05187, over 4594616.30 frames. ], batch size: 196, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:08:44,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 10:08:44,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:47,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:47,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:48,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:48,827 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:50,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:54,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:55,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:08:58,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 10:09:00,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:09:01,437 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.889e+02 2.125e+02 2.539e+02 3.349e+02, threshold=4.249e+02, percent-clipped=0.0 2023-09-30 10:09:01,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:09:01,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:03,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:03,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 10:09:03,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:03,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 10:09:07,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:10,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:12,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:09:12,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 10:09:16,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:16,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:17,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=678333.3333333334, ans=0.1 2023-09-30 10:09:17,183 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=678333.3333333334, ans=0.0 2023-09-30 10:09:18,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=678333.3333333334, ans=0.125 2023-09-30 10:09:23,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:09:23,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:09:24,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 10:09:27,127 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 10:09:27,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 10:09:27,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:09:27,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 10:09:27,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=678333.3333333334, ans=0.0 2023-09-30 10:09:29,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:31,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:09:36,375 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 10:09:37,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 10:09:39,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:09:39,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:09:42,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:09:45,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:46,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 10:09:47,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:51,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. 
Duration: 0.86 2023-09-30 10:09:58,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:00,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:10:02,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 10:10:02,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=678533.3333333334, ans=0.0 2023-09-30 10:10:03,118 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.41 vs. limit=22.5 2023-09-30 10:10:04,528 INFO [train.py:1039] (0/4) Epoch 20, batch 850, loss[loss=0.1925, simple_loss=0.2687, pruned_loss=0.05817, over 23971.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2529, pruned_loss=0.05189, over 4620306.14 frames. ], batch size: 80, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:10:04,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:10:04,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:06,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 10:10:07,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:07,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:10:08,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:10,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:10:11,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:10:13,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 10:10:13,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 10:10:13,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 10:10:14,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:14,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:10:17,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:17,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:17,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 10:10:18,322 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=678533.3333333334, ans=0.125 2023-09-30 10:10:23,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:24,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:25,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 10:10:27,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 10:10:30,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:31,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=678600.0, ans=0.125 2023-09-30 10:10:32,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 10:10:32,563 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=678600.0, ans=0.0 2023-09-30 10:10:38,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 10:10:39,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 10:10:40,141 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.31 vs. 
limit=15.0 2023-09-30 10:10:43,227 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 10:10:43,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:43,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:10:43,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:10:46,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 10:10:48,421 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.12 vs. limit=12.0 2023-09-30 10:10:49,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:51,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:51,172 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:10:51,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 10:10:52,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:10:54,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=678733.3333333334, ans=0.5 2023-09-30 10:10:55,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:10:55,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 10:11:00,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:11:00,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:02,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:11:02,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:03,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:04,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=678733.3333333334, ans=0.2 2023-09-30 10:11:08,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 10:11:09,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:11:13,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:11:13,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:14,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:11:17,594 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.60 vs. limit=15.0 2023-09-30 10:11:23,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:11:24,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:24,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 10:11:26,246 INFO [train.py:1039] (0/4) Epoch 20, batch 900, loss[loss=0.1855, simple_loss=0.2582, pruned_loss=0.05645, over 23708.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2536, pruned_loss=0.05152, over 4659484.57 frames. ], batch size: 212, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:11:26,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:26,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 10:11:27,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 10:11:36,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:11:37,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:39,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 10:11:40,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:11:40,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 10:11:43,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:11:45,397 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.863e+02 2.352e+02 2.808e+02 3.950e+02, threshold=4.705e+02, percent-clipped=0.0 2023-09-30 10:11:45,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:45,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:11:45,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:11:45,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 10:11:54,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:54,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:55,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:11:59,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:05,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 10:12:05,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:12:13,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:12:15,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:12:15,170 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 10:12:16,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 10:12:22,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:12:22,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 10:12:22,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:12:28,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:30,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:12:31,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 10:12:31,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:34,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 10:12:36,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:12:37,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:38,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=679133.3333333334, ans=0.125 2023-09-30 10:12:39,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:12:39,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:12:43,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. 
Duration: 0.62 2023-09-30 10:12:43,067 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 10:12:43,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:12:43,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 10:12:47,489 INFO [train.py:1039] (0/4) Epoch 20, batch 950, loss[loss=0.1777, simple_loss=0.2415, pruned_loss=0.05698, over 23864.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2534, pruned_loss=0.05131, over 4672816.37 frames. ], batch size: 195, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:12:47,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:50,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 10:12:56,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:00,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:13:03,199 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. 
Duration: 0.9510625 2023-09-30 10:13:06,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:07,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:08,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:08,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:13:09,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 10:13:11,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:13:13,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:13,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 10:13:14,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:18,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:13:19,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 10:13:22,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:13:24,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:26,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:13:32,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:13:32,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:36,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 10:13:37,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:13:37,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:13:39,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:40,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:40,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 10:13:44,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 10:13:46,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:13:47,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:47,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:47,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 10:13:48,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=679400.0, ans=0.025 2023-09-30 10:13:49,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:49,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:13:49,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 10:13:53,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:13:57,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:14:00,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:14:02,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 10:14:02,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 10:14:06,905 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=679466.6666666666, ans=0.125 2023-09-30 10:14:07,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:14:10,688 INFO [train.py:1039] (0/4) Epoch 20, batch 1000, loss[loss=0.1562, simple_loss=0.2326, pruned_loss=0.0399, over 24340.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2531, pruned_loss=0.05139, over 4672484.33 frames. ], batch size: 56, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:14:13,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 10:14:15,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:20,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:14:20,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 10:14:20,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 10:14:25,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:14:25,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:28,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:29,652 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.877e+02 1.981e+02 2.279e+02 3.470e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 10:14:29,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 10:14:33,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=679600.0, ans=0.2 2023-09-30 10:14:33,076 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=679600.0, ans=0.125 2023-09-30 10:14:36,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 10:14:36,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=679600.0, ans=0.2 2023-09-30 10:14:37,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 10:14:37,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:38,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 10:14:39,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. 
Duration: 0.5 2023-09-30 10:14:39,994 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=679600.0, ans=0.125 2023-09-30 10:14:40,604 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.59 vs. limit=22.5 2023-09-30 10:14:41,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 10:14:43,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:45,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:55,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:55,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:14:56,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:56,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:56,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 10:14:56,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:57,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 10:14:58,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:58,576 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 10:15:01,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 10:15:04,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 10:15:04,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 10:15:05,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=679733.3333333334, ans=0.125 2023-09-30 10:15:06,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:15:10,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=679733.3333333334, ans=0.125 2023-09-30 10:15:15,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:15,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:15:15,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:17,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 10:15:19,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 10:15:21,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:15:21,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 10:15:22,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 10:15:24,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:24,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:15:26,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:15:30,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:15:31,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:15:32,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=679866.6666666666, ans=0.2 2023-09-30 10:15:34,041 INFO [train.py:1039] (0/4) Epoch 20, batch 1050, loss[loss=0.172, simple_loss=0.2442, pruned_loss=0.04987, over 23438.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2517, pruned_loss=0.05065, over 4680331.87 frames. 
], batch size: 119, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:15:35,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:15:37,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:15:40,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:15:41,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:43,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:15:46,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:15:48,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:15:50,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:15:52,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:15:52,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:15:53,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 10:15:55,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 10:15:55,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:15:55,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 10:15:58,988 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:58,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 10:15:59,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:16:04,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:16:04,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:16:04,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:16:07,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 10:16:07,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 10:16:09,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 10:16:11,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 10:16:13,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 10:16:14,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:19,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:16:22,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:16:23,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:16:23,277 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=680066.6666666666, ans=0.125 2023-09-30 10:16:24,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:16:26,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=680066.6666666666, ans=0.125 2023-09-30 10:16:29,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:16:32,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 10:16:34,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-09-30 10:16:34,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 10:16:34,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:34,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:16:38,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 10:16:41,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:16:42,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:42,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:16:44,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:44,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 10:16:49,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 10:16:49,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 10:16:49,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 10:16:51,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:16:56,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:16:58,140 INFO [train.py:1039] (0/4) Epoch 20, batch 1100, loss[loss=0.1833, simple_loss=0.2493, pruned_loss=0.05866, over 23679.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2514, pruned_loss=0.05054, over 4683240.18 frames. ], batch size: 232, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:17:01,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:17:06,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:17:08,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:17:08,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=680200.0, ans=0.125 2023-09-30 10:17:10,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:10,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. 
Duration: 0.76 2023-09-30 10:17:12,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:17:13,622 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:17:15,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:17:16,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:17:17,916 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.800e+02 1.958e+02 2.202e+02 3.142e+02, threshold=3.917e+02, percent-clipped=0.0 2023-09-30 10:17:19,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:17:19,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 10:17:21,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:17:23,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:23,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:17:26,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:17:27,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 10:17:30,194 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.16 vs. limit=6.0 2023-09-30 10:17:33,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:17:36,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 10:17:36,455 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 10:17:37,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:40,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:41,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:17:41,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:17:43,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 10:17:43,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:17:43,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:17:45,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:17:45,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:45,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 10:17:49,994 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:17:50,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 10:17:53,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:17:59,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:18:02,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 10:18:04,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 10:18:05,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:08,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:09,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:11,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. 
Duration: 0.86 2023-09-30 10:18:12,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:18:12,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:14,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 10:18:14,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:18:16,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 10:18:16,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:18:16,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:18:17,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:18:21,501 INFO [train.py:1039] (0/4) Epoch 20, batch 1150, loss[loss=0.1678, simple_loss=0.2446, pruned_loss=0.04547, over 23648.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.252, pruned_loss=0.05057, over 4703914.77 frames. ], batch size: 149, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:18:24,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:28,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 10:18:30,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:30,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:18:30,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 10:18:30,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:34,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 10:18:36,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:36,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:18:40,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 10:18:43,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:47,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:48,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:18:48,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. 
Duration: 0.54 2023-09-30 10:18:48,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:18:50,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:53,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 10:18:55,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:57,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:19:06,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:07,089 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=680666.6666666666, ans=0.0 2023-09-30 10:19:08,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=680733.3333333334, ans=0.125 2023-09-30 10:19:15,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:17,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 10:19:17,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:17,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:19:26,229 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 10:19:27,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:32,835 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 10:19:38,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:38,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:19:39,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:19:39,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:19:41,989 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.56 vs. limit=15.0 2023-09-30 10:19:43,295 INFO [train.py:1039] (0/4) Epoch 20, batch 1200, loss[loss=0.1825, simple_loss=0.2579, pruned_loss=0.05361, over 23432.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2531, pruned_loss=0.05089, over 4709016.39 frames. ], batch size: 106, lr: 5.18e-03, grad_scale: 32.0 2023-09-30 10:19:44,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:49,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 10:19:49,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:19:51,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:19:51,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:52,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:19:52,949 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:19:55,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:19:57,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:19:59,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:59,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:02,633 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.067e+02 2.397e+02 3.713e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 10:20:02,859 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 10:20:03,807 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.20 vs. 
limit=15.0 2023-09-30 10:20:04,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 10:20:06,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=680933.3333333334, ans=0.125 2023-09-30 10:20:08,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:20:11,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:20:14,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:15,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:20:15,825 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 10:20:16,102 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=681000.0, ans=0.2 2023-09-30 10:20:17,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:25,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:20:25,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:20:25,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. 
Duration: 0.94 2023-09-30 10:20:27,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:20:30,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 10:20:34,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 10:20:34,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:36,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:38,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:38,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:20:41,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:41,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:20:43,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:20:43,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 10:20:44,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 10:20:44,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:20:44,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:20:48,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:20:48,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:51,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=681133.3333333334, ans=0.0 2023-09-30 10:20:52,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:20:55,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:20:57,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 10:21:00,207 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.06 vs. limit=15.0 2023-09-30 10:21:01,145 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 10:21:03,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:05,361 INFO [train.py:1039] (0/4) Epoch 20, batch 1250, loss[loss=0.1923, simple_loss=0.2831, pruned_loss=0.05076, over 24678.00 frames. 
], tot_loss[loss=0.1778, simple_loss=0.2535, pruned_loss=0.05104, over 4711199.55 frames. ], batch size: 73, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:21:06,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:21:09,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:21:11,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:21:14,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 10:21:17,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:21:19,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:20,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 10:21:23,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:21:25,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:21:29,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:21:30,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 10:21:31,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:21:31,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:32,705 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.69 vs. limit=5.0 2023-09-30 10:21:33,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:21:38,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:21:38,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:21:38,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:40,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:41,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:43,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:21:45,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:21:51,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-09-30 10:21:51,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:21:53,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:21:54,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 10:21:54,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:54,714 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 10:21:56,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:56,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:58,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:21:58,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=681400.0, ans=0.125 2023-09-30 10:22:01,255 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:22:02,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:22:02,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 10:22:04,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 10:22:05,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 10:22:05,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 10:22:08,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:09,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 10:22:11,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:15,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:22:15,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:22:16,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 10:22:16,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:22:18,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:22:18,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. 
Duration: 0.67775 2023-09-30 10:22:18,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:22:19,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 10:22:23,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:25,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:22:27,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:22:28,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:22:30,058 INFO [train.py:1039] (0/4) Epoch 20, batch 1300, loss[loss=0.1562, simple_loss=0.2344, pruned_loss=0.03898, over 23321.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2538, pruned_loss=0.05092, over 4719386.74 frames. ], batch size: 93, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:22:31,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:33,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 10:22:33,868 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=681533.3333333334, ans=0.125 2023-09-30 10:22:37,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:22:41,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:22:41,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:22:44,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:44,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:22:46,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 10:22:52,550 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.925e+02 2.165e+02 2.491e+02 3.486e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 10:22:52,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:22:54,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:22:56,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 10:22:57,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:23:03,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:04,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 10:23:06,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:23:06,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:07,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:23:07,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:23:07,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 10:23:14,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:23:16,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:23:16,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=681666.6666666666, ans=0.035 2023-09-30 10:23:17,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 10:23:17,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:23:20,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:23:23,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 10:23:23,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 10:23:25,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:25,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 10:23:27,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:31,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:31,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:23:34,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 10:23:35,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 10:23:35,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=681800.0, ans=0.125 2023-09-30 10:23:36,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 10:23:40,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:23:44,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. 
Duration: 0.72 2023-09-30 10:23:45,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:52,105 INFO [train.py:1039] (0/4) Epoch 20, batch 1350, loss[loss=0.1943, simple_loss=0.2744, pruned_loss=0.05713, over 23366.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2533, pruned_loss=0.05051, over 4723466.70 frames. ], batch size: 93, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:23:52,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 10:23:56,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:00,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:03,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:24:04,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:06,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:24:07,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:10,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:12,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. 
Duration: 0.79 2023-09-30 10:24:13,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=681933.3333333334, ans=0.2 2023-09-30 10:24:15,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:15,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:24:18,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 10:24:18,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:24:19,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:24:19,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 10:24:20,113 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=681933.3333333334, ans=0.125 2023-09-30 10:24:21,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 10:24:24,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 10:24:27,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:27,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. 
Duration: 0.86 2023-09-30 10:24:38,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=682000.0, ans=0.2 2023-09-30 10:24:40,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:50,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 10:24:53,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:54,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 10:24:54,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:54,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:58,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:25:00,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=682133.3333333334, ans=0.125 2023-09-30 10:25:01,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. 
Duration: 0.67 2023-09-30 10:25:01,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:25:08,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 10:25:09,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 10:25:14,392 INFO [train.py:1039] (0/4) Epoch 20, batch 1400, loss[loss=0.1674, simple_loss=0.2483, pruned_loss=0.04323, over 23195.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2511, pruned_loss=0.05025, over 4720588.95 frames. ], batch size: 105, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:25:16,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 10:25:18,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:25:18,421 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=682200.0, ans=0.125 2023-09-30 10:25:21,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:25:23,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:25:29,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 10:25:32,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. 
Duration: 0.86 2023-09-30 10:25:37,263 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.836e+02 1.987e+02 2.257e+02 3.606e+02, threshold=3.975e+02, percent-clipped=0.0 2023-09-30 10:25:42,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:25:44,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:25:46,821 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.27 vs. limit=15.0 2023-09-30 10:25:47,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:25:47,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:25:52,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:25:53,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:25:55,285 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.24 vs. limit=6.0 2023-09-30 10:25:58,122 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=682333.3333333334, ans=0.2 2023-09-30 10:26:03,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 10:26:03,935 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:09,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 10:26:09,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:26:09,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:26:10,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:26:10,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:12,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:26:12,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:26:13,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:26:14,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 10:26:14,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 10:26:21,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=682466.6666666666, ans=0.125 2023-09-30 10:26:22,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:25,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:26:29,799 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=682466.6666666666, ans=0.125 2023-09-30 10:26:31,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 10:26:32,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:26:34,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:26:35,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:26:35,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:37,962 INFO [train.py:1039] (0/4) Epoch 20, batch 1450, loss[loss=0.1706, simple_loss=0.2352, pruned_loss=0.05306, over 23641.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2507, pruned_loss=0.04983, over 4733612.73 frames. ], batch size: 256, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:26:38,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 10:26:38,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=682533.3333333334, ans=0.125 2023-09-30 10:26:41,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:26:44,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:26:44,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:44,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:26:50,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:51,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:26:53,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:53,141 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 10:26:54,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:26:55,061 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682600.0, ans=0.1 2023-09-30 10:26:56,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. 
Duration: 0.9 2023-09-30 10:26:56,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:56,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:26:56,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 10:26:59,429 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:26:59,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:27:00,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 10:27:00,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:03,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:27:04,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:07,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:09,906 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:27:11,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 10:27:11,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:27:16,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:27:16,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:16,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:17,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:27:17,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:17,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:22,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 10:27:26,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:27:28,427 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=682733.3333333334, ans=0.125 2023-09-30 10:27:31,144 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 10:27:32,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 10:27:32,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:27:34,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:36,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 10:27:39,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:41,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 10:27:42,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 10:27:43,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:46,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:27:47,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:51,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 10:27:51,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 10:27:52,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. 
Duration: 0.55 2023-09-30 10:27:54,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:54,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:28:01,508 INFO [train.py:1039] (0/4) Epoch 20, batch 1500, loss[loss=0.1633, simple_loss=0.252, pruned_loss=0.03731, over 24574.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2518, pruned_loss=0.05007, over 4731529.46 frames. ], batch size: 71, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:28:06,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 10:28:07,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:28:07,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:28:09,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:09,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:11,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:28:13,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 10:28:14,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 10:28:14,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:28:14,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:16,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:28:16,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:28:17,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:24,679 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.891e+02 2.112e+02 2.423e+02 4.358e+02, threshold=4.223e+02, percent-clipped=4.0 2023-09-30 10:28:24,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:24,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 10:28:24,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:28:25,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:28:26,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:29,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. 
Duration: 0.87 2023-09-30 10:28:32,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 10:28:33,162 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=683000.0, ans=0.0 2023-09-30 10:28:35,188 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:36,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 10:28:38,850 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=15.0 2023-09-30 10:28:39,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:28:41,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:28:42,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:42,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:28:44,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 10:28:45,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:28:45,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:28:46,141 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=683000.0, ans=0.125 2023-09-30 10:28:46,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=683000.0, ans=0.125 2023-09-30 10:28:47,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 10:28:48,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:54,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:28:54,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 10:28:54,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=683066.6666666666, ans=0.125 2023-09-30 10:28:56,804 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.33 vs. limit=22.5 2023-09-30 10:28:59,203 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:29:02,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 10:29:02,547 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=683066.6666666666, ans=0.1 2023-09-30 10:29:04,020 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=683066.6666666666, ans=0.0 2023-09-30 10:29:06,578 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 10:29:06,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:06,671 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 10:29:09,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:11,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:12,111 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 10:29:12,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:29:16,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 10:29:18,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 10:29:21,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:22,788 INFO [train.py:1039] (0/4) Epoch 20, batch 1550, loss[loss=0.1812, simple_loss=0.2636, pruned_loss=0.04942, over 24100.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2525, pruned_loss=0.05042, over 4736678.41 frames. ], batch size: 80, lr: 5.17e-03, grad_scale: 8.0 2023-09-30 10:29:22,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:23,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:29:26,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 10:29:26,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 10:29:26,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:29:26,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=683200.0, ans=0.125 2023-09-30 10:29:28,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 10:29:28,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. 
Duration: 0.9 2023-09-30 10:29:31,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:33,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:33,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:29:33,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:29:35,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:35,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:36,961 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 10:29:38,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:38,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:29:39,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:29:42,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:29:43,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. 
Duration: 0.71 2023-09-30 10:29:43,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:43,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 10:29:45,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 10:29:45,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 10:29:47,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:49,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:29:52,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:55,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 10:29:55,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 10:30:05,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:08,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:30:08,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 10:30:08,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:30:10,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 10:30:11,984 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=683400.0, ans=0.125 2023-09-30 10:30:14,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:30:18,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:20,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:30:23,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:30:24,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:24,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 10:30:25,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:26,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:30:28,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:30:28,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:30:29,826 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 10:30:31,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:38,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 10:30:44,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:45,997 INFO [train.py:1039] (0/4) Epoch 20, batch 1600, loss[loss=0.1649, simple_loss=0.254, pruned_loss=0.03785, over 24311.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2536, pruned_loss=0.05109, over 4728621.27 frames. ], batch size: 74, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:30:46,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:46,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 10:30:47,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:49,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:49,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 10:30:49,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:30:50,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:30:54,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:54,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 10:30:54,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 10:30:56,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 10:30:58,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:01,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 10:31:03,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:31:04,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:31:09,445 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.873e+02 2.054e+02 2.261e+02 3.957e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 10:31:11,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 10:31:14,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 10:31:16,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:31:17,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 10:31:19,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:19,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 10:31:21,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=683666.6666666666, ans=0.125 2023-09-30 10:31:25,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 10:31:34,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:35,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 10:31:37,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:37,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:37,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 10:31:40,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 10:31:45,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 10:31:45,985 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.47 vs. limit=12.0 2023-09-30 10:31:46,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:31:48,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:31:52,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:31:53,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:31:55,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:32:00,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:32:02,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:03,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=683800.0, ans=10.0 2023-09-30 10:32:05,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 10:32:05,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:32:05,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 10:32:08,710 INFO [train.py:1039] (0/4) Epoch 20, batch 1650, loss[loss=0.1455, simple_loss=0.2212, pruned_loss=0.03489, over 24433.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2541, pruned_loss=0.05139, over 4718788.07 frames. ], batch size: 58, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:32:09,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:11,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:32:12,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=683866.6666666666, ans=0.125 2023-09-30 10:32:13,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:32:13,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-09-30 10:32:13,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 10:32:13,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 10:32:13,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 10:32:18,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:20,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:20,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:32:20,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:32:21,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:24,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 10:32:27,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:32:27,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:27,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 10:32:27,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:32:27,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=683933.3333333334, ans=0.0 2023-09-30 10:32:28,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 10:32:28,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 10:32:35,818 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:32:38,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:32:47,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 10:32:48,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:32:52,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 10:32:55,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:32:58,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:32:58,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 10:32:58,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:00,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:33:00,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:01,257 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.85 vs. limit=15.0 2023-09-30 10:33:04,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:04,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:06,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:06,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:07,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:10,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:33:12,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 10:33:13,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 10:33:15,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:16,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 10:33:19,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 10:33:19,102 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 10:33:19,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:20,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:33:20,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:20,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:20,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 10:33:23,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:33:24,347 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:33:25,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:33:25,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:26,016 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=684133.3333333334, ans=0.125 2023-09-30 10:33:28,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=684133.3333333334, ans=0.0 2023-09-30 10:33:30,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 10:33:32,085 INFO [train.py:1039] (0/4) Epoch 20, batch 1700, loss[loss=0.1673, simple_loss=0.2287, pruned_loss=0.053, over 23582.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2533, pruned_loss=0.05119, over 4728570.81 frames. ], batch size: 256, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:33:33,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:33,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:33:33,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 10:33:36,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 10:33:36,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:33:36,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:38,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:33:38,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:33:39,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 10:33:40,338 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684200.0, ans=0.1 2023-09-30 10:33:41,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:33:51,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:54,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:33:55,784 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.837e+02 2.036e+02 2.305e+02 3.271e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-30 10:33:59,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. 
Duration: 0.536375 2023-09-30 10:33:59,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:00,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:34:00,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:03,842 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:34:05,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 10:34:05,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:34:05,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:09,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:34:11,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:34:12,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 10:34:12,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. 
Duration: 0.92 2023-09-30 10:34:13,067 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:34:14,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:15,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 10:34:17,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:34:27,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:27,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:28,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:30,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:34:30,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 10:34:31,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:33,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:33,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. 
Duration: 0.91 2023-09-30 10:34:34,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:34:34,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:36,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:36,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:34:39,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:39,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:34:41,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:43,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:34:43,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:47,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:48,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 10:34:50,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:34:51,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:53,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 10:34:55,013 INFO [train.py:1039] (0/4) Epoch 20, batch 1750, loss[loss=0.1815, simple_loss=0.2668, pruned_loss=0.04813, over 24561.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2521, pruned_loss=0.05091, over 4714718.15 frames. ], batch size: 71, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:35:00,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:01,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:02,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=684533.3333333334, ans=0.125 2023-09-30 10:35:03,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:35:03,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 10:35:03,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:35:06,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:35:06,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:35:11,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 10:35:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:17,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 10:35:17,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:19,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:35:20,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:35:22,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 10:35:24,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:35:25,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 10:35:32,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:35:35,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:35:35,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:35:35,655 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=684666.6666666666, ans=0.0 2023-09-30 10:35:37,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=684666.6666666666, ans=0.0 2023-09-30 10:35:37,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=684666.6666666666, ans=0.125 2023-09-30 10:35:37,973 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.50 vs. limit=22.5 2023-09-30 10:35:38,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:38,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:40,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:35:41,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=684666.6666666666, ans=0.2 2023-09-30 10:35:43,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:46,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:46,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 10:35:48,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 10:35:50,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:52,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 10:35:54,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:35:55,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:55,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:35:58,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:36:00,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:36:01,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:01,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:36:02,122 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.47 vs. 
limit=15.0 2023-09-30 10:36:04,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=684800.0, ans=0.125 2023-09-30 10:36:05,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:08,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:10,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:36:10,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=684800.0, ans=0.125 2023-09-30 10:36:11,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 10:36:11,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:13,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:36:13,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:13,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:36:14,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:36:16,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 10:36:17,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684866.6666666666, ans=0.1 2023-09-30 10:36:18,215 INFO [train.py:1039] (0/4) Epoch 20, batch 1800, loss[loss=0.1795, simple_loss=0.2489, pruned_loss=0.05502, over 23504.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2512, pruned_loss=0.05062, over 4711345.94 frames. ], batch size: 120, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:36:18,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:36:20,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:21,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=684866.6666666666, ans=0.125 2023-09-30 10:36:22,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:36:24,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:27,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:36:29,232 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=684866.6666666666, ans=0.2 2023-09-30 10:36:30,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:36:32,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:36:34,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:35,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:36,742 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.60 vs. limit=15.0 2023-09-30 10:36:37,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:36:37,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:38,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 10:36:39,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:39,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=684933.3333333334, ans=0.035 2023-09-30 10:36:41,811 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.858e+02 2.042e+02 2.295e+02 4.168e+02, threshold=4.085e+02, percent-clipped=1.0 2023-09-30 10:36:42,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:46,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 10:36:48,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. 
Duration: 0.85 2023-09-30 10:36:48,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 10:36:49,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:51,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:51,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:53,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:37:02,096 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 10:37:02,814 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-09-30 10:37:03,575 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:37:05,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:07,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 10:37:07,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 10:37:08,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 10:37:10,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:37:11,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:37:15,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 10:37:22,800 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=685133.3333333334, ans=0.125 2023-09-30 10:37:23,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:37:24,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 10:37:25,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:37:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:25,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:37:27,074 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 10:37:30,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:37:30,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 10:37:33,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 10:37:33,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:36,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:36,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:37:36,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:38,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:39,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:37:41,846 INFO [train.py:1039] (0/4) Epoch 20, batch 1850, loss[loss=0.1736, simple_loss=0.2563, pruned_loss=0.04543, over 24450.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2519, pruned_loss=0.05058, over 4715627.31 frames. ], batch size: 69, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:37:42,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:37:42,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:45,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 10:37:45,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:37:51,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:37:51,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 10:37:54,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 10:37:56,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=685266.6666666666, ans=0.0 2023-09-30 10:37:59,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 10:38:04,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:04,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 10:38:04,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:38:14,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:38:17,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 10:38:19,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 10:38:20,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:21,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=685333.3333333334, ans=0.125 2023-09-30 10:38:25,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 10:38:25,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:25,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:38:27,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:38:28,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:38:29,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=685400.0, ans=0.0 2023-09-30 10:38:30,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:38:32,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:38:33,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:34,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-30 10:38:34,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:37,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:39,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:38:41,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 10:38:43,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:47,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:38:49,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:38:49,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 10:38:49,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 10:38:50,857 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 10:38:52,867 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 10:38:54,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 10:38:54,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:54,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:38:55,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:57,450 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 10:38:57,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:38:57,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:57,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=685466.6666666666, ans=0.0 2023-09-30 10:38:59,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:39:00,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:39:02,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:02,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 10:39:03,574 INFO [train.py:1039] (0/4) Epoch 20, batch 1900, loss[loss=0.1888, simple_loss=0.2677, pruned_loss=0.05497, over 24032.00 frames. 
], tot_loss[loss=0.1776, simple_loss=0.2527, pruned_loss=0.05124, over 4707176.98 frames. ], batch size: 80, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:39:05,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:05,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 10:39:05,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:39:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:07,404 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.11 vs. limit=15.0 2023-09-30 10:39:09,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=685533.3333333334, ans=0.0 2023-09-30 10:39:10,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=685533.3333333334, ans=0.125 2023-09-30 10:39:11,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=685533.3333333334, ans=0.1 2023-09-30 10:39:13,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:16,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:39:17,099 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. 
Duration: 0.896 2023-09-30 10:39:18,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 10:39:18,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:39:20,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:39:20,356 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 10:39:20,413 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 10:39:24,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 10:39:25,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:39:27,723 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.870e+02 2.135e+02 2.444e+02 3.596e+02, threshold=4.270e+02, percent-clipped=0.0 2023-09-30 10:39:29,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 10:39:29,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=685600.0, ans=0.125 2023-09-30 10:39:31,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=685600.0, ans=0.0 2023-09-30 10:39:32,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. 
Duration: 0.81 2023-09-30 10:39:41,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 10:39:45,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 10:39:45,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:45,815 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 10:39:45,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 10:39:46,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=685666.6666666666, ans=0.0 2023-09-30 10:39:47,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 10:39:47,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 10:39:47,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:39:52,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 10:39:56,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:39:59,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 10:39:59,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 10:40:01,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:40:03,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 10:40:03,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:10,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:40:10,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:40:10,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:40:12,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:40:13,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:40:15,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:40:15,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:40:18,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 10:40:18,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:22,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:40:22,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:40:23,719 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:23,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:28,015 INFO [train.py:1039] (0/4) Epoch 20, batch 1950, loss[loss=0.2577, simple_loss=0.3072, pruned_loss=0.1041, over 19424.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2534, pruned_loss=0.05161, over 4707254.47 frames. ], batch size: 388, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:40:28,208 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:29,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:40:29,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:29,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:40:32,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. 
Duration: 0.62 2023-09-30 10:40:34,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:40:34,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:36,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:39,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:40:39,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:39,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:42,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:40:42,776 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=685933.3333333334, ans=0.0 2023-09-30 10:40:45,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:45,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:40:45,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:40:45,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 10:40:48,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:51,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:51,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:51,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:40:51,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 10:40:53,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:40:53,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=685933.3333333334, ans=0.0 2023-09-30 10:40:54,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:40:55,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:57,504 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:40:58,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:02,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 10:41:05,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:41:07,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:41:07,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:09,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 10:41:09,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:14,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:41:15,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:41:15,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:20,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=686066.6666666666, ans=0.0 2023-09-30 10:41:23,855 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=686066.6666666666, ans=0.125 2023-09-30 10:41:25,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:41:25,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:28,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:32,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:36,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:41:36,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:37,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 10:41:37,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:41:38,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:39,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 10:41:41,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:41:46,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:47,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 10:41:47,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:50,537 INFO [train.py:1039] (0/4) Epoch 20, batch 2000, loss[loss=0.1638, simple_loss=0.2489, pruned_loss=0.0393, over 24583.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2543, pruned_loss=0.05231, over 4709726.99 frames. ], batch size: 71, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:41:50,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:41:53,641 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:55,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 10:41:56,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:58,480 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=686200.0, ans=0.1 2023-09-30 10:41:59,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:42:02,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 10:42:03,690 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.40 vs. limit=22.5 2023-09-30 10:42:04,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 10:42:04,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:42:05,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=686266.6666666666, ans=0.0 2023-09-30 10:42:06,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:42:08,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 10:42:09,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:13,100 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.882e+02 2.052e+02 2.299e+02 3.277e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 10:42:13,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 10:42:13,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:42:16,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. 
Duration: 0.93 2023-09-30 10:42:16,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:20,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:42:21,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:42:21,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:23,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:23,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=686333.3333333334, ans=0.0 2023-09-30 10:42:24,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:26,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 10:42:29,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 10:42:29,263 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:29,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:42:35,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 10:42:36,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:42:36,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:38,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:42:39,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:39,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:42,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:42,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:44,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:48,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:48,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 10:42:56,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:42:56,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 10:43:01,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:01,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:43:05,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:08,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:43:08,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:43:10,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:10,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:11,745 INFO [train.py:1039] (0/4) Epoch 20, batch 2050, loss[loss=0.1676, simple_loss=0.2469, pruned_loss=0.04416, over 23661.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2533, pruned_loss=0.05167, over 4721476.96 frames. ], batch size: 149, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:43:13,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:43:14,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:15,532 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=686533.3333333334, ans=0.125 2023-09-30 10:43:20,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:43:23,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:43:24,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:25,905 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:43:27,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 10:43:27,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:43:31,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:43:31,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 10:43:31,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=686600.0, ans=0.2 2023-09-30 10:43:36,266 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=686600.0, ans=0.125 2023-09-30 10:43:39,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:39,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=686600.0, ans=0.125 2023-09-30 10:43:40,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:44,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 10:43:45,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:47,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 10:43:48,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:51,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:55,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 10:43:55,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=686666.6666666666, ans=0.5 2023-09-30 10:43:57,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:43:57,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:59,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:44:00,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:44:00,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:44:05,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:06,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:44:10,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:44:10,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:44:13,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:18,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 10:44:19,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 10:44:25,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:26,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:44:29,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:44:32,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 10:44:34,064 INFO [train.py:1039] (0/4) Epoch 20, batch 2100, loss[loss=0.1851, simple_loss=0.2506, pruned_loss=0.05981, over 23652.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2519, pruned_loss=0.05108, over 4726756.50 frames. ], batch size: 135, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:44:35,784 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 10:44:35,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:35,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:37,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:44:37,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:44:37,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 10:44:37,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 10:44:39,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:43,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:44:44,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:44:47,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:47,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:44:47,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 10:44:49,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:44:49,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 10:44:49,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 10:44:52,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 10:44:52,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:44:52,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 10:44:52,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 10:44:53,997 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=686933.3333333334, ans=0.0 2023-09-30 10:44:58,446 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.064e+02 2.437e+02 3.000e+02 4.850e+02, threshold=4.873e+02, percent-clipped=5.0 2023-09-30 10:44:58,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 10:44:58,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:45:02,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:02,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=686933.3333333334, ans=0.95 2023-09-30 10:45:03,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:45:07,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 10:45:07,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 10:45:09,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:09,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:45:12,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 10:45:13,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:13,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 10:45:13,806 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 10:45:13,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 10:45:17,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:45:19,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:45:22,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 10:45:25,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:26,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:26,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 10:45:26,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:26,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:28,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:28,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 10:45:30,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 10:45:30,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 10:45:33,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:45:37,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:45:39,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. 
Duration: 0.62 2023-09-30 10:45:44,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:47,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:45:48,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:45:49,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:45:49,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:45:50,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:45:52,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:52,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:45:52,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:45:52,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:55,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 10:45:57,398 INFO [train.py:1039] (0/4) Epoch 20, batch 2150, loss[loss=0.1614, simple_loss=0.239, pruned_loss=0.04191, over 24317.00 frames. 
], tot_loss[loss=0.1757, simple_loss=0.2502, pruned_loss=0.05061, over 4717500.32 frames. ], batch size: 61, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:45:57,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 10:45:57,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:59,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:59,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:45:59,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:45:59,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:46:02,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=687200.0, ans=0.125 2023-09-30 10:46:05,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:46:06,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:07,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:10,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 10:46:10,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:10,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:46:13,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:15,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:46:15,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:46:21,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:21,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 10:46:26,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:28,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:46:28,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:28,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:29,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:46:29,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:46:31,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:31,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:46:31,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:46:32,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 10:46:35,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:46:35,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:35,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:37,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:46:38,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:46:42,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:42,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 10:46:45,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:45,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 10:46:48,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:46:49,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:51,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:52,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:54,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:46:54,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:46:56,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:56,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 10:46:58,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 10:46:58,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 10:47:00,935 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 10:47:01,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:01,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:02,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 10:47:02,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:47:02,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 10:47:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 10:47:02,561 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 10:47:02,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 10:47:04,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:04,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:47:04,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 10:47:05,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:07,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:47:08,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:08,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:18,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:47:18,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 10:47:20,092 INFO [train.py:1039] (0/4) Epoch 20, batch 2200, loss[loss=0.1712, simple_loss=0.2431, pruned_loss=0.04965, over 24324.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2511, pruned_loss=0.05082, over 4721762.10 frames. ], batch size: 61, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:47:24,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:47:27,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=687533.3333333334, ans=0.125 2023-09-30 10:47:28,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:30,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 10:47:30,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:47:32,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:47:35,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:35,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:47:35,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 10:47:40,085 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.43 vs. limit=22.5 2023-09-30 10:47:40,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 10:47:42,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:47:45,301 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.904e+02 2.106e+02 2.500e+02 4.276e+02, threshold=4.212e+02, percent-clipped=0.0 2023-09-30 10:47:45,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 10:47:48,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:50,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 10:47:50,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:53,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:47:53,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 10:47:59,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:48:01,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:01,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:48:03,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=687666.6666666666, ans=0.125 2023-09-30 10:48:04,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:48:06,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:09,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:48:11,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:12,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. 
Duration: 0.63 2023-09-30 10:48:14,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:15,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 10:48:18,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:18,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:48:18,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:21,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:48:23,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:23,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:23,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:25,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:48:25,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:48:28,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-30 10:48:31,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:48:31,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:48:35,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:48:35,671 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 10:48:38,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:48:38,878 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 10:48:40,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:48:41,825 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 10:48:43,247 INFO [train.py:1039] (0/4) Epoch 20, batch 2250, loss[loss=0.1724, simple_loss=0.2425, pruned_loss=0.05116, over 23752.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2518, pruned_loss=0.05072, over 4730790.37 frames. ], batch size: 135, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:48:43,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:43,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:48:45,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 10:48:47,159 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 10:48:48,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:48:50,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:48:56,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:48:57,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:49:01,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:03,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:03,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:49:07,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 10:49:07,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:07,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:49:08,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. 
Duration: 0.71 2023-09-30 10:49:09,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:49:10,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:11,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:17,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:18,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:49:18,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:49:21,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 10:49:21,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:23,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:49:25,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=688000.0, ans=0.0 2023-09-30 10:49:26,056 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.81 vs. 
limit=22.5 2023-09-30 10:49:27,115 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=688000.0, ans=0.2 2023-09-30 10:49:28,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:28,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:31,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:49:31,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:34,012 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=688066.6666666666, ans=0.125 2023-09-30 10:49:35,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:36,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:49:38,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:49:42,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:49:46,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:49:47,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 10:49:47,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:49:52,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:49:55,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:49:55,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 10:49:55,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:49:56,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:49:57,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=688133.3333333334, ans=0.0 2023-09-30 10:49:59,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 10:50:01,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:50:03,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:06,371 INFO [train.py:1039] (0/4) Epoch 20, batch 2300, loss[loss=0.1835, simple_loss=0.2512, pruned_loss=0.05794, over 23951.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2534, pruned_loss=0.05113, over 4724488.86 frames. 
], batch size: 196, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:50:10,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:10,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:50:11,717 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 10:50:14,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:21,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:50:21,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:50:23,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:50:23,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:23,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 10:50:25,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:50:26,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:28,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 10:50:30,452 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. limit=15.0 2023-09-30 10:50:30,974 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.830e+02 1.981e+02 2.237e+02 3.602e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 10:50:32,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:50:34,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:50:37,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:50:42,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:50:43,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:46,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:50:47,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:47,886 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=688333.3333333334, ans=0.0 2023-09-30 10:50:48,588 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.69 vs. 
limit=12.0 2023-09-30 10:50:52,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:54,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:50:54,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:50:54,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 10:50:57,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:50:57,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:00,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:00,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:51:01,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:04,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 10:51:04,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:51:04,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. 
Duration: 0.97 2023-09-30 10:51:04,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:51:04,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:06,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 10:51:14,335 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:51:18,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:51:22,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:22,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:51:22,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:51:25,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:51:25,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:51:27,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:51:28,887 INFO [train.py:1039] (0/4) Epoch 20, batch 2350, loss[loss=0.1745, simple_loss=0.2629, pruned_loss=0.04309, over 24659.00 frames. 
], tot_loss[loss=0.1778, simple_loss=0.2535, pruned_loss=0.0511, over 4730669.41 frames. ], batch size: 73, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:51:29,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 10:51:34,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=688533.3333333334, ans=0.2 2023-09-30 10:51:35,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:51:35,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 10:51:35,873 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=688533.3333333334, ans=0.125 2023-09-30 10:51:43,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 10:51:46,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:49,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:51:50,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:51:53,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 10:51:56,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:52:01,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 10:52:02,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:52:04,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:52:04,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:52:04,591 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=688666.6666666666, ans=0.1 2023-09-30 10:52:05,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=688666.6666666666, ans=0.0 2023-09-30 10:52:08,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:52:10,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 10:52:10,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:52:13,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:52:13,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:14,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:52:17,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:52:19,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 10:52:20,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:52:24,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:52:24,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:52:26,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 10:52:26,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:52:29,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 10:52:29,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:52:34,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. 
Duration: 0.86 2023-09-30 10:52:34,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=688800.0, ans=0.04949747468305833 2023-09-30 10:52:38,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 10:52:38,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:39,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:52:39,601 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 10:52:40,688 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.87 vs. limit=6.0 2023-09-30 10:52:41,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 10:52:43,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 10:52:46,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:52:46,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=688800.0, ans=0.125 2023-09-30 10:52:50,533 INFO [train.py:1039] (0/4) Epoch 20, batch 2400, loss[loss=0.1683, simple_loss=0.2565, pruned_loss=0.04004, over 24649.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2535, pruned_loss=0.05076, over 4736642.85 frames. 
], batch size: 73, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:52:52,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:52:55,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:52:57,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:52:58,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 10:52:59,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 10:53:02,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=688866.6666666666, ans=0.0 2023-09-30 10:53:08,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:53:08,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:09,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=688933.3333333334, ans=0.2 2023-09-30 10:53:10,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 10:53:11,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:53:11,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:53:13,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 10:53:14,618 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.889e+02 2.089e+02 2.400e+02 4.035e+02, threshold=4.178e+02, percent-clipped=1.0 2023-09-30 10:53:18,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:22,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 10:53:27,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:53:31,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 10:53:35,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:53:35,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:40,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:53:42,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 10:53:42,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:53:50,250 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:53:52,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:53:56,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:56,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:53:56,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:53:57,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:53:57,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:57,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:53:57,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:54:01,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:02,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:54:02,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 10:54:04,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. 
Duration: 0.96 2023-09-30 10:54:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:54:07,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:54:09,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 10:54:09,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 10:54:09,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 10:54:09,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 10:54:10,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 10:54:10,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:54:12,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:12,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:14,372 INFO [train.py:1039] (0/4) Epoch 20, batch 2450, loss[loss=0.1705, simple_loss=0.2575, pruned_loss=0.04178, over 24690.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.253, pruned_loss=0.05097, over 4726556.94 frames. ], batch size: 68, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:54:14,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. 
Duration: 0.931875 2023-09-30 10:54:15,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:16,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:54:20,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:54:20,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:24,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:24,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:54:26,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 10:54:26,754 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=689200.0, ans=0.5 2023-09-30 10:54:32,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:54:32,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:35,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:54:35,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 10:54:35,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:54:35,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 10:54:41,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=689266.6666666666, ans=0.025 2023-09-30 10:54:42,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:45,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:54:45,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:48,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=689333.3333333334, ans=0.125 2023-09-30 10:54:50,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:54:50,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:54:53,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 10:54:55,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:54:59,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=689333.3333333334, ans=0.125 2023-09-30 10:55:02,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:05,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:55:05,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:06,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:55:06,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:08,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:55:08,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 10:55:13,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:55:13,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 10:55:16,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:55:16,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:22,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:55:22,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 10:55:25,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:55:25,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:55:25,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 10:55:26,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:55:28,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:55:31,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:55:33,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:33,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 10:55:36,684 INFO [train.py:1039] (0/4) Epoch 20, batch 2500, loss[loss=0.1861, simple_loss=0.2723, pruned_loss=0.04999, over 24419.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.252, pruned_loss=0.05074, over 4729407.76 frames. ], batch size: 77, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:55:37,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 10:55:38,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:55:44,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:47,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=689533.3333333334, ans=0.0 2023-09-30 10:55:52,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=689600.0, ans=0.0 2023-09-30 10:55:55,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:55:55,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:56,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:56,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. 
Duration: 0.3454375 2023-09-30 10:56:01,699 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.945e+02 2.213e+02 2.493e+02 3.500e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-30 10:56:03,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:56:03,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:03,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=689600.0, ans=0.05 2023-09-30 10:56:05,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:56:05,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:56:05,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 10:56:05,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=689600.0, ans=0.0 2023-09-30 10:56:07,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:07,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:08,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. 
Duration: 0.83 2023-09-30 10:56:08,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:10,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 10:56:10,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:12,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=689666.6666666666, ans=0.125 2023-09-30 10:56:15,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:56:15,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=689666.6666666666, ans=0.125 2023-09-30 10:56:16,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:20,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:56:20,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 10:56:22,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:56:22,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:26,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:56:31,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:35,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:40,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:56:43,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 10:56:43,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:43,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:56:46,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:56:46,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:56:48,467 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 10:56:48,467 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 10:56:48,476 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 10:56:53,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 10:56:55,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 10:56:55,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 10:56:56,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:56,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 10:56:59,587 INFO [train.py:1039] (0/4) Epoch 20, batch 2550, loss[loss=0.1755, simple_loss=0.249, pruned_loss=0.05094, over 23627.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2517, pruned_loss=0.05039, over 4717228.76 frames. ], batch size: 149, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:56:59,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 10:57:03,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=689866.6666666666, ans=0.125 2023-09-30 10:57:04,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:05,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:57:07,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 10:57:07,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=689866.6666666666, ans=0.0 2023-09-30 10:57:10,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:12,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 10:57:12,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:57:15,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 10:57:15,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:57:19,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:22,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:57:22,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 10:57:23,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:24,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:24,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 10:57:27,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:57:27,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 10:57:28,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:57:28,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:28,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 10:57:41,846 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=690000.0, ans=0.0 2023-09-30 10:57:43,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:57:46,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:57:47,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=690066.6666666666, ans=0.125 2023-09-30 10:57:48,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:48,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:49,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 10:57:55,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:58,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:58,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:57:58,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:58:00,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:58:00,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:58:05,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:05,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:11,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:58:11,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 10:58:11,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:58:11,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 10:58:13,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:58:14,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:58:15,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:21,655 INFO [train.py:1039] (0/4) Epoch 20, batch 2600, loss[loss=0.204, simple_loss=0.2803, pruned_loss=0.06386, over 24048.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2522, pruned_loss=0.05027, over 4712857.84 frames. ], batch size: 80, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:58:23,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:58:26,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:28,794 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 10:58:29,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=690200.0, ans=0.0 2023-09-30 10:58:31,779 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 10:58:31,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:58:32,539 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 10:58:33,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. 
Duration: 0.92 2023-09-30 10:58:33,876 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 10:58:34,730 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.62 vs. limit=15.0 2023-09-30 10:58:38,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:38,250 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 10:58:40,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 10:58:40,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=690266.6666666666, ans=0.0 2023-09-30 10:58:42,032 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 10:58:43,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:58:45,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 10:58:45,458 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=690266.6666666666, ans=0.125 2023-09-30 10:58:46,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 10:58:46,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 10:58:48,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 10:58:49,610 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.870e+02 2.130e+02 2.382e+02 3.027e+02, threshold=4.260e+02, percent-clipped=0.0 2023-09-30 10:58:51,231 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 10:58:51,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 10:58:53,841 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=690333.3333333334, ans=0.125 2023-09-30 10:58:56,917 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=690333.3333333334, ans=0.125 2023-09-30 10:58:58,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:58:59,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:59,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:58:59,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 10:59:03,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 10:59:05,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=690333.3333333334, ans=0.0 2023-09-30 10:59:08,030 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=690333.3333333334, ans=0.1 2023-09-30 10:59:09,241 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 10:59:11,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=690400.0, ans=0.125 2023-09-30 10:59:14,730 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=690400.0, ans=0.0 2023-09-30 10:59:15,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:15,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:16,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 10:59:16,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:16,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:59:18,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 10:59:21,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 10:59:22,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:59:24,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:24,907 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=690400.0, ans=0.125 2023-09-30 10:59:27,946 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 10:59:29,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:29,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:59:32,675 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=690466.6666666666, ans=0.0 2023-09-30 10:59:34,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:34,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:59:34,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 10:59:36,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:39,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 10:59:39,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:59:43,976 INFO [train.py:1039] (0/4) Epoch 20, batch 2650, loss[loss=0.1785, simple_loss=0.2488, pruned_loss=0.0541, over 23427.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2531, pruned_loss=0.04991, over 4725630.84 frames. ], batch size: 119, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:59:44,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 10:59:45,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:49,481 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:59:54,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 10:59:54,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:55,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:59:57,194 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 10:59:57,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:00,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:00:01,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:00:04,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:00:06,262 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:00:07,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 11:00:07,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:00:07,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:00:09,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 11:00:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 11:00:16,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:19,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 11:00:19,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:21,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. 
Duration: 0.76 2023-09-30 11:00:23,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:23,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:00:23,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:25,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:29,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 11:00:29,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 11:00:33,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:00:36,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 11:00:38,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:38,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:38,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:00:39,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:00:39,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:41,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:43,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:44,011 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.71 vs. limit=15.0 2023-09-30 11:00:44,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:46,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:00:47,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:00:49,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:49,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:00:49,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=690800.0, ans=0.2 2023-09-30 11:00:52,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:00:52,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:52,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:00:56,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:56,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=690800.0, ans=0.125 2023-09-30 11:00:57,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:00:57,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:57,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 11:01:01,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:03,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:03,775 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.74 vs. limit=10.0 2023-09-30 11:01:05,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:01:06,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:06,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:01:08,176 INFO [train.py:1039] (0/4) Epoch 20, batch 2700, loss[loss=0.1796, simple_loss=0.2443, pruned_loss=0.05743, over 23449.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2545, pruned_loss=0.05084, over 4722352.07 frames. ], batch size: 285, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 11:01:08,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:11,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:11,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 11:01:12,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:01:14,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:01:16,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:01:16,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:16,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:01:19,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:01:19,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:01:19,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:01:19,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:01:19,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 11:01:21,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:01:22,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:01:24,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:01:24,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:28,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:01:30,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 11:01:30,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 11:01:35,954 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.889e+02 2.106e+02 2.437e+02 3.198e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 11:01:36,755 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.41 vs. limit=15.0 2023-09-30 11:01:37,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:01:37,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:01:38,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=690933.3333333334, ans=0.1 2023-09-30 11:01:45,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:01:45,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:45,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:01:46,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:01:47,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:01:51,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 11:01:51,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:01:51,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:01:56,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:56,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:01:56,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691066.6666666666, ans=0.1 2023-09-30 11:02:04,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:02:04,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:07,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=691066.6666666666, ans=0.2 2023-09-30 11:02:08,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:02:08,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:10,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:02:12,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:13,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:02:13,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=691133.3333333334, ans=0.125 2023-09-30 11:02:15,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:16,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:16,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:19,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:02:19,908 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=691133.3333333334, ans=0.0 2023-09-30 11:02:21,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:21,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:24,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. 
Duration: 0.42225 2023-09-30 11:02:26,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:28,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:02:28,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 11:02:29,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 11:02:29,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:30,973 INFO [train.py:1039] (0/4) Epoch 20, batch 2750, loss[loss=0.1646, simple_loss=0.2221, pruned_loss=0.05359, over 22684.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2541, pruned_loss=0.05114, over 4709054.54 frames. ], batch size: 322, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:02:34,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:02:34,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:36,349 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=691200.0, ans=0.0 2023-09-30 11:02:37,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:38,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 11:02:39,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:42,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:02:42,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:02:44,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:02:44,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:44,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 11:02:44,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:02:44,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:50,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 11:02:53,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:55,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:55,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 11:02:55,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:02:56,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:58,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:03:00,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:00,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:03,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:03:03,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:03:04,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:03:06,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:06,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:03:13,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:03:14,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:03:14,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:15,334 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=691333.3333333334, ans=0.0 2023-09-30 11:03:20,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:20,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:03:20,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:03:29,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:03:29,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:03:29,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. 
Duration: 0.75 2023-09-30 11:03:29,636 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691400.0, ans=0.1 2023-09-30 11:03:34,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=691466.6666666666, ans=0.09899494936611666 2023-09-30 11:03:35,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:37,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 11:03:42,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:03:42,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=691466.6666666666, ans=0.125 2023-09-30 11:03:45,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:03:45,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 11:03:47,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:03:49,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:03:49,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. 
Duration: 0.77 2023-09-30 11:03:49,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=691466.6666666666, ans=0.125 2023-09-30 11:03:50,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:03:52,274 INFO [train.py:1039] (0/4) Epoch 20, batch 2800, loss[loss=0.186, simple_loss=0.2488, pruned_loss=0.06162, over 23896.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2534, pruned_loss=0.05067, over 4706072.86 frames. ], batch size: 195, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:03:53,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:03:53,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:03:53,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:03:55,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 11:03:55,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:57,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:59,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:59,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. 
Duration: 0.708 2023-09-30 11:03:59,312 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 11:04:01,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:04,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:04:04,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:04:07,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:04:10,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 11:04:12,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:04:13,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 11:04:13,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:15,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:04:15,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:04:17,521 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=691600.0, ans=0.2 2023-09-30 11:04:19,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=691600.0, ans=0.125 2023-09-30 11:04:20,057 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.934e+02 2.198e+02 2.522e+02 3.773e+02, threshold=4.395e+02, percent-clipped=0.0 2023-09-30 11:04:20,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:21,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:21,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:04:21,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:04:25,700 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=691666.6666666666, ans=0.0 2023-09-30 11:04:30,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:04:32,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:34,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:35,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 11:04:36,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:43,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:43,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 11:04:45,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:45,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:45,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:04:49,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:50,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:55,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:57,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:04:57,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:57,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. 
Duration: 0.5666875 2023-09-30 11:04:57,789 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.74 vs. limit=12.0 2023-09-30 11:04:58,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:05:00,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:05:00,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:05:00,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 11:05:00,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:05:02,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 11:05:04,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:04,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:05:06,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 11:05:07,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 11:05:15,442 INFO [train.py:1039] (0/4) Epoch 20, batch 2850, loss[loss=0.1631, simple_loss=0.2396, pruned_loss=0.04332, over 23339.00 frames. ], tot_loss[loss=0.176, simple_loss=0.252, pruned_loss=0.04994, over 4706049.92 frames. ], batch size: 93, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:05:15,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:05:15,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:05:16,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:05:20,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:23,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:05:23,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:05:25,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:05:28,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:29,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:05:31,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:05:31,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 11:05:34,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=691933.3333333334, ans=0.125 2023-09-30 11:05:35,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=691933.3333333334, ans=0.1 2023-09-30 11:05:37,369 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:05:38,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 11:05:38,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:40,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 11:05:40,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:43,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 11:05:43,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=691933.3333333334, ans=0.0 2023-09-30 11:05:45,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. 
Duration: 0.88 2023-09-30 11:05:47,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:52,113 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=692000.0, ans=0.0 2023-09-30 11:06:00,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:01,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:01,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:06:01,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:06:03,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:06:03,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:06:06,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:06:06,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 11:06:07,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.23 vs. limit=15.0 2023-09-30 11:06:09,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 11:06:09,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:09,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:11,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:12,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:12,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:16,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:18,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:19,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:06:21,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:21,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:25,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:06:28,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 11:06:30,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 11:06:30,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 11:06:33,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:06:34,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:34,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 11:06:35,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:06:35,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:36,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:36,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:06:36,523 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 11:06:36,611 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 11:06:36,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 11:06:37,966 INFO [train.py:1039] (0/4) Epoch 20, batch 2900, loss[loss=0.1593, simple_loss=0.2465, pruned_loss=0.03608, over 24448.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2516, pruned_loss=0.04948, over 4715840.17 frames. ], batch size: 63, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:06:38,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:44,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:06:44,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:44,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:45,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 11:06:49,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:51,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 11:06:51,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 11:06:53,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:06:53,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. 
Duration: 0.788875 2023-09-30 11:06:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:55,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:58,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:59,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:07:02,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:07:04,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 11:07:05,483 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.792e+02 1.963e+02 2.139e+02 2.917e+02, threshold=3.926e+02, percent-clipped=0.0 2023-09-30 11:07:05,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:07:07,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:09,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 11:07:11,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 11:07:14,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:07:14,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 11:07:14,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:07:16,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:07:16,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:07:19,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:07:20,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:24,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:07:29,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:07:31,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 11:07:33,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 11:07:33,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 11:07:33,571 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=692400.0, ans=0.0 2023-09-30 11:07:33,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=692400.0, ans=0.125 2023-09-30 11:07:36,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:07:39,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 11:07:40,975 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:07:46,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:55,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:07:55,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:07:57,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 11:07:59,380 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.56 vs. limit=6.0 2023-09-30 11:08:00,652 INFO [train.py:1039] (0/4) Epoch 20, batch 2950, loss[loss=0.1558, simple_loss=0.231, pruned_loss=0.04034, over 24364.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2519, pruned_loss=0.04949, over 4720162.41 frames. 
], batch size: 56, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:08:00,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:00,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 11:08:02,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:03,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:08:08,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:09,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 11:08:11,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:11,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:14,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:14,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:08:15,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 11:08:15,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. 
Duration: 0.84 2023-09-30 11:08:17,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:08:17,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:22,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:24,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:24,468 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=692600.0, ans=0.1 2023-09-30 11:08:27,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:08:28,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:30,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=692600.0, ans=0.2 2023-09-30 11:08:33,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:08:33,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:08:35,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:08:35,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:08:38,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 11:08:42,558 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=692666.6666666666, ans=0.125 2023-09-30 11:08:42,703 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=692666.6666666666, ans=0.2 2023-09-30 11:08:42,947 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.09 vs. limit=15.0 2023-09-30 11:08:43,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 11:08:43,912 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 11:08:45,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:08:46,890 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 11:08:47,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=692666.6666666666, ans=0.2 2023-09-30 11:08:48,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 11:08:48,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:08:50,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:50,028 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 11:08:50,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:08:50,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=692733.3333333334, ans=0.125 2023-09-30 11:08:52,354 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.37 vs. limit=22.5 2023-09-30 11:08:53,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 11:08:55,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:55,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:08:57,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=692733.3333333334, ans=0.0 2023-09-30 11:08:58,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:59,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:09:01,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:09:01,247 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 11:09:01,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:09:01,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 11:09:02,034 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.70 vs. limit=15.0 2023-09-30 11:09:08,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:10,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:10,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 11:09:10,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:09:13,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 11:09:16,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:16,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:09:16,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 11:09:18,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:19,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:09:21,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:09:22,918 INFO [train.py:1039] (0/4) Epoch 20, batch 3000, loss[loss=0.1986, simple_loss=0.2632, pruned_loss=0.06701, over 23730.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2526, pruned_loss=0.05031, over 4721731.53 frames. ], batch size: 179, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:09:22,919 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 11:09:37,401 INFO [train.py:1071] (0/4) Epoch 20, validation: loss=0.3156, simple_loss=0.2725, pruned_loss=0.1794, over 1125622.00 frames. 2023-09-30 11:09:37,402 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20954MB 2023-09-30 11:09:37,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:37,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:09:37,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:09:39,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:40,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 11:09:40,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:40,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 11:09:41,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:43,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:09:44,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:09:48,488 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 11:09:49,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 11:09:51,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:52,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:09:54,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 11:09:54,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:10:00,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 11:10:00,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=692933.3333333334, ans=0.09899494936611666 2023-09-30 11:10:05,478 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.867e+02 2.114e+02 2.609e+02 3.839e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 11:10:10,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:10:14,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=693000.0, ans=0.0 2023-09-30 11:10:15,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 11:10:17,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:10:21,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:10:22,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:10:23,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:10:26,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:26,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 11:10:27,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. 
Duration: 0.96 2023-09-30 11:10:29,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:10:30,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:10:32,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:10:32,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:33,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:33,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:10:39,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:10:39,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:39,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:10:40,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:42,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 11:10:42,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 11:10:42,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:10:42,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:10:47,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:47,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:49,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:10:49,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 11:10:49,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:10:51,374 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 11:10:51,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:10:54,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 11:10:57,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:10:58,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-30 11:10:58,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 11:11:00,425 INFO [train.py:1039] (0/4) Epoch 20, batch 3050, loss[loss=0.1546, simple_loss=0.2453, pruned_loss=0.03195, over 24633.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2528, pruned_loss=0.05047, over 4718965.64 frames. ], batch size: 73, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:11:00,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 11:11:00,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:11:01,383 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.99 vs. limit=22.5 2023-09-30 11:11:01,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:11:03,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:11:03,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:11:03,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:03,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:11:06,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. 
Duration: 0.81 2023-09-30 11:11:08,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:10,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:10,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:11:15,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:17,186 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.38 vs. limit=15.0 2023-09-30 11:11:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 11:11:23,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 11:11:23,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 11:11:25,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:28,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:11:32,175 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-104000.pt 2023-09-30 11:11:35,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:11:35,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:35,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:38,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:11:38,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:11:40,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:40,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:40,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:42,259 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:45,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:46,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:48,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 11:11:48,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:11:48,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:11:52,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:54,431 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:11:54,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:11:56,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:11:58,532 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=12.0 2023-09-30 11:12:00,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:12:00,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:03,274 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=693400.0, ans=0.0 2023-09-30 11:12:09,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:10,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:12:10,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:12,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:12,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:12:12,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:12:13,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 11:12:15,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:15,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:17,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 11:12:18,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:23,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 11:12:23,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=693533.3333333334, ans=0.5 2023-09-30 11:12:23,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=693533.3333333334, ans=0.125 2023-09-30 11:12:25,030 INFO [train.py:1039] (0/4) Epoch 20, batch 3100, loss[loss=0.163, simple_loss=0.2228, pruned_loss=0.05154, over 22586.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2526, pruned_loss=0.04987, over 4736587.76 frames. ], batch size: 322, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:12:26,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:12:28,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:12:30,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 11:12:31,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 11:12:33,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 11:12:35,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:12:39,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:12:39,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:12:40,023 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=693533.3333333334, ans=0.125 2023-09-30 11:12:42,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:12:47,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:52,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 11:12:55,876 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.886e+02 2.147e+02 2.525e+02 3.564e+02, threshold=4.295e+02, percent-clipped=0.0 2023-09-30 11:12:57,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:12:57,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:57,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:12:59,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:59,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:13:00,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:13:02,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. 
Duration: 0.62 2023-09-30 11:13:02,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:13:03,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:05,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 11:13:06,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:13:11,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=693666.6666666666, ans=0.125 2023-09-30 11:13:12,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:13:12,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=693666.6666666666, ans=0.125 2023-09-30 11:13:13,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 11:13:14,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 11:13:16,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:18,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:20,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:13:20,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:20,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:13:22,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:13:22,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:13:23,030 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.44 vs. limit=15.0 2023-09-30 11:13:24,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:13:24,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:13:24,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:24,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:13:24,460 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=693733.3333333334, ans=0.125 2023-09-30 11:13:24,903 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.17 vs. 
limit=15.0 2023-09-30 11:13:29,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:13:30,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 11:13:32,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:13:33,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 11:13:35,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:35,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:35,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 11:13:38,743 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=693800.0, ans=0.0 2023-09-30 11:13:48,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 11:13:49,311 INFO [train.py:1039] (0/4) Epoch 20, batch 3150, loss[loss=0.169, simple_loss=0.2344, pruned_loss=0.05178, over 23579.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2516, pruned_loss=0.04962, over 4732244.42 frames. ], batch size: 256, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:13:51,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:13:51,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:51,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=693866.6666666666, ans=0.1 2023-09-30 11:13:52,803 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:13:52,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:13:52,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 11:13:54,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:54,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=693866.6666666666, ans=0.125 2023-09-30 11:13:55,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:13:57,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 11:13:57,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=693866.6666666666, ans=0.025 2023-09-30 11:13:59,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:01,592 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. 
Duration: 0.831 2023-09-30 11:14:01,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=693866.6666666666, ans=0.09899494936611666 2023-09-30 11:14:04,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 11:14:05,047 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=693933.3333333334, ans=10.0 2023-09-30 11:14:06,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:06,302 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 11:14:06,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=693933.3333333334, ans=0.1 2023-09-30 11:14:07,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:14:09,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 11:14:09,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 11:14:09,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 11:14:09,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:09,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:14:10,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:11,180 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:14:13,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 11:14:15,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:15,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:17,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:20,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:14:25,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 11:14:26,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:14:29,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:14:30,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:30,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. 
Duration: 0.9 2023-09-30 11:14:33,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 11:14:35,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:14:35,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:14:36,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:14:36,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:36,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:14:37,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:14:37,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:14:39,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 11:14:40,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:14:40,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:14:41,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=694066.6666666666, ans=0.0 2023-09-30 11:14:42,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:14:42,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:42,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=694066.6666666666, ans=0.0 2023-09-30 11:14:43,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 11:14:43,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:44,933 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.86 vs. limit=10.0 2023-09-30 11:14:45,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 11:14:46,383 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.64 vs. limit=15.0 2023-09-30 11:14:47,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:47,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 11:14:47,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. 
Duration: 0.95 2023-09-30 11:14:50,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:14:50,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:51,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 11:14:52,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 11:14:52,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:56,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:57,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:57,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:15:01,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=694133.3333333334, ans=0.125 2023-09-30 11:15:04,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=694133.3333333334, ans=0.125 2023-09-30 11:15:05,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. 
Duration: 0.7666875 2023-09-30 11:15:05,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:08,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 11:15:11,652 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=694200.0, ans=0.125 2023-09-30 11:15:12,660 INFO [train.py:1039] (0/4) Epoch 20, batch 3200, loss[loss=0.1655, simple_loss=0.2424, pruned_loss=0.04433, over 24292.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2513, pruned_loss=0.0493, over 4734435.94 frames. ], batch size: 56, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:15:12,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:15:12,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 11:15:12,916 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=694200.0, ans=0.125 2023-09-30 11:15:18,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:20,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:15:20,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 11:15:21,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:15:25,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:15:30,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:35,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=694266.6666666666, ans=0.0 2023-09-30 11:15:39,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:15:42,125 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.858e+02 2.103e+02 2.505e+02 4.292e+02, threshold=4.206e+02, percent-clipped=0.0 2023-09-30 11:15:43,009 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.14 vs. limit=15.0 2023-09-30 11:15:49,366 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.04 vs. limit=15.0 2023-09-30 11:15:50,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 11:15:51,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:15:54,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 11:15:54,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-30 11:15:55,387 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.62 vs. limit=12.0 2023-09-30 11:15:58,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:15:58,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:16:00,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:16:03,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 11:16:05,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=694400.0, ans=0.0 2023-09-30 11:16:06,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:16:08,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 11:16:13,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 11:16:14,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=694400.0, ans=0.1 2023-09-30 11:16:15,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 11:16:15,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=694400.0, ans=0.0 2023-09-30 11:16:22,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:22,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:16:23,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:23,091 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 11:16:23,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:16:26,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:27,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 11:16:29,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 11:16:29,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=694466.6666666666, ans=0.2 2023-09-30 11:16:30,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 11:16:30,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. 
Duration: 0.9 2023-09-30 11:16:33,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:16:35,053 INFO [train.py:1039] (0/4) Epoch 20, batch 3250, loss[loss=0.1847, simple_loss=0.2507, pruned_loss=0.05936, over 23814.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.251, pruned_loss=0.04917, over 4737557.34 frames. ], batch size: 179, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:16:36,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:16:36,625 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 11:16:36,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:16:36,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:16:39,541 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 11:16:39,901 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=694533.3333333334, ans=0.0 2023-09-30 11:16:44,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:16:49,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:16:56,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:16:56,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 11:16:57,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:57,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:57,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:16:59,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:16:59,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:17:02,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:02,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:17:02,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:04,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:17:10,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:10,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=694666.6666666666, ans=0.0 2023-09-30 11:17:11,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:17:12,096 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=694666.6666666666, ans=0.2 2023-09-30 11:17:14,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:14,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:15,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:15,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:17:15,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:19,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 11:17:21,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:17:21,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:17:21,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=694666.6666666666, ans=0.2 2023-09-30 11:17:22,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:22,999 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=694733.3333333334, ans=0.125 2023-09-30 11:17:24,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:17:31,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:17:32,190 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-09-30 11:17:39,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:17:39,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:39,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 11:17:39,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 11:17:39,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:17:39,705 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=694800.0, ans=0.125 2023-09-30 11:17:41,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:43,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 11:17:44,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 11:17:44,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:47,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:48,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:48,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:17:48,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:53,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:54,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:17:56,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 11:17:56,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:17:57,613 INFO [train.py:1039] (0/4) Epoch 20, batch 3300, loss[loss=0.176, simple_loss=0.2617, pruned_loss=0.04514, over 24616.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2517, pruned_loss=0.04967, over 4726064.99 frames. ], batch size: 68, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:17:57,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:17:57,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 11:18:01,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:18:01,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 11:18:05,133 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 11:18:05,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 11:18:06,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:11,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:18:12,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:18:12,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:12,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:18:14,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:18:16,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:17,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:18:24,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 11:18:24,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:24,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:26,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:27,594 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 11:18:27,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:18:29,015 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.888e+02 2.046e+02 2.323e+02 3.227e+02, threshold=4.091e+02, percent-clipped=0.0 2023-09-30 11:18:29,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:18:30,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:18:30,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:18:30,733 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 11:18:37,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:37,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:18:39,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:39,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 11:18:40,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 11:18:41,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:42,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 11:18:44,031 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 11:18:45,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 11:18:45,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:18:47,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 11:18:47,610 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=695066.6666666666, ans=0.125 2023-09-30 11:18:50,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:18:53,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:18:53,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:18:55,902 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.18 vs. limit=22.5 2023-09-30 11:18:56,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:56,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:56,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:18:56,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:18:59,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=695066.6666666666, ans=0.125 2023-09-30 11:19:00,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:19:01,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:01,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:19:03,618 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 11:19:03,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 11:19:08,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:19:08,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:08,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:09,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:19:09,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 11:19:11,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:19:12,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:12,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:19:14,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:15,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:19:18,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 11:19:20,112 INFO [train.py:1039] (0/4) Epoch 20, batch 3350, loss[loss=0.1966, simple_loss=0.2592, pruned_loss=0.06702, over 22784.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2527, pruned_loss=0.05068, over 4723063.47 frames. ], batch size: 322, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:19:20,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:21,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:23,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:19:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 11:19:25,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:27,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:27,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:30,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:19:31,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:33,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:19:35,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:37,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:19:38,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:39,071 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=695266.6666666666, ans=0.0 2023-09-30 11:19:40,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:19:41,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. 
Duration: 0.91 2023-09-30 11:19:42,006 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=695266.6666666666, ans=0.125 2023-09-30 11:19:43,312 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 11:19:44,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:48,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 11:19:48,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 11:19:48,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:19:49,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:19:50,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:50,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=695266.6666666666, ans=0.125 2023-09-30 11:19:51,284 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.68 vs. limit=15.0 2023-09-30 11:19:52,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 11:19:52,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:19:52,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:19:55,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:57,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:57,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:57,317 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=695333.3333333334, ans=0.125 2023-09-30 11:19:59,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:20:02,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:05,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:06,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:10,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:20:10,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:20:12,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:12,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:12,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=695400.0, ans=0.2 2023-09-30 11:20:15,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:15,504 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=695400.0, ans=0.125 2023-09-30 11:20:16,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 11:20:16,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:20:16,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 11:20:16,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:20:18,298 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 11:20:20,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:21,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:20:30,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:32,057 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 11:20:32,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:20:33,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:20:33,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:20:39,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:20:43,110 INFO [train.py:1039] (0/4) Epoch 20, batch 3400, loss[loss=0.1795, simple_loss=0.2716, pruned_loss=0.04369, over 24316.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2536, pruned_loss=0.05042, over 4728445.39 frames. ], batch size: 74, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:20:43,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 11:20:43,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:20:43,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:20:44,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:20:44,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 11:20:46,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:46,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 11:20:47,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:47,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:49,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:20:49,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:20:50,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 11:20:53,370 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:20:55,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 11:20:55,855 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 11:20:55,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:20:57,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=695600.0, ans=0.1 2023-09-30 11:21:00,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:21:00,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:21:02,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:04,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:21:07,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:09,385 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=695600.0, ans=0.125 2023-09-30 11:21:10,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 11:21:14,107 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.831e+02 2.000e+02 2.171e+02 2.770e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 11:21:15,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:21:18,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:18,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:21:20,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:21:26,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:21:30,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 11:21:37,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 11:21:37,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:21:38,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:40,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:40,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:21:41,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:21:47,243 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=695800.0, ans=0.0 2023-09-30 11:21:48,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:21:48,392 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:21:53,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:21:54,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 11:21:59,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:22:04,519 INFO [train.py:1039] (0/4) Epoch 20, batch 3450, loss[loss=0.1608, simple_loss=0.2368, pruned_loss=0.0424, over 23532.00 frames. ], tot_loss[loss=0.177, simple_loss=0.253, pruned_loss=0.05052, over 4725638.35 frames. ], batch size: 134, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:22:04,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 11:22:11,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 11:22:11,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:13,484 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 11:22:13,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 11:22:13,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=695866.6666666666, ans=0.2 2023-09-30 11:22:15,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:22:18,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:22:19,104 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.85 vs. limit=15.0 2023-09-30 11:22:20,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=695933.3333333334, ans=0.09899494936611666 2023-09-30 11:22:24,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:22:25,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:26,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:22:26,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:28,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:22:33,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 11:22:38,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=696000.0, ans=0.2 2023-09-30 11:22:39,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 11:22:41,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:22:41,420 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:22:43,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:43,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=696000.0, ans=0.125 2023-09-30 11:22:50,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 11:22:51,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:22:55,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:22:55,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:57,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 11:22:57,351 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=696066.6666666666, ans=0.125 2023-09-30 11:22:58,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:23:00,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 11:23:00,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:02,469 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.38 vs. limit=15.0 2023-09-30 11:23:03,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:23:06,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:07,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 11:23:11,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:23:15,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 11:23:16,143 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=696133.3333333334, ans=0.1 2023-09-30 11:23:19,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:20,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:25,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:25,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:23:26,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:23:26,132 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:27,412 INFO [train.py:1039] (0/4) Epoch 20, batch 3500, loss[loss=0.1598, simple_loss=0.2459, pruned_loss=0.03685, over 24497.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2517, pruned_loss=0.0497, over 4720630.55 frames. ], batch size: 66, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:23:31,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:34,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:23:34,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. 
Duration: 0.68 2023-09-30 11:23:37,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:23:41,012 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:23:44,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:44,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 11:23:48,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:23:48,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:48,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=696266.6666666666, ans=0.125 2023-09-30 11:23:52,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:23:52,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:23:52,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:23:53,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:53,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 11:23:55,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 11:23:58,756 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.890e+02 2.132e+02 2.452e+02 4.334e+02, threshold=4.264e+02, percent-clipped=1.0 2023-09-30 11:23:58,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:58,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:24:00,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:04,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:05,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 11:24:05,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:24:07,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:07,735 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=696333.3333333334, ans=0.09899494936611666 2023-09-30 11:24:10,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 11:24:10,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:12,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:24:12,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:14,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 11:24:14,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 11:24:15,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 11:24:15,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:16,662 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.85 vs. limit=6.0 2023-09-30 11:24:17,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:18,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:18,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:24:23,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-30 11:24:24,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:24:30,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:24:30,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 11:24:32,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 11:24:32,100 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:24:35,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:35,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:36,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:41,793 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 11:24:43,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:45,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:45,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. 
Duration: 0.96 2023-09-30 11:24:46,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 11:24:47,240 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=696466.6666666666, ans=0.125 2023-09-30 11:24:48,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:49,935 INFO [train.py:1039] (0/4) Epoch 20, batch 3550, loss[loss=0.1619, simple_loss=0.2384, pruned_loss=0.04269, over 24335.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2507, pruned_loss=0.04944, over 4710919.22 frames. ], batch size: 56, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:24:50,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:51,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:24:51,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:24:56,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:25:05,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:07,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:25:10,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 11:25:12,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:25:14,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:14,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:25:14,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:25:17,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:19,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:25:19,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:19,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:25:21,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:25:24,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=696666.6666666666, ans=0.1 2023-09-30 11:25:27,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:25:27,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 11:25:28,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:28,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:29,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:25:29,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 11:25:29,124 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:32,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:32,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:25:40,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:40,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:40,474 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=696733.3333333334, ans=0.0 2023-09-30 11:25:42,067 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.57 vs. limit=22.5 2023-09-30 11:25:42,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:25:44,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 11:25:44,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:25:45,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 11:25:47,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:48,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:25:48,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:25:54,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 11:25:56,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:25:59,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:01,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 11:26:02,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:05,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:26:07,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 11:26:12,328 INFO [train.py:1039] (0/4) Epoch 20, batch 3600, loss[loss=0.1424, simple_loss=0.2197, pruned_loss=0.03255, over 24424.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2498, pruned_loss=0.04936, over 4708786.84 frames. ], batch size: 58, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:26:14,672 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 11:26:14,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:26:16,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:26:16,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=696866.6666666666, ans=0.125 2023-09-30 11:26:17,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:17,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:19,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:26:22,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:24,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:26:26,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:26:26,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:26:27,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:27,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 11:26:30,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:26:32,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:34,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:35,754 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=696933.3333333334, ans=0.1 2023-09-30 11:26:38,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:40,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:26:40,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:26:41,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 11:26:42,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:43,316 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.766e+02 1.915e+02 2.241e+02 3.227e+02, threshold=3.831e+02, percent-clipped=0.0 2023-09-30 11:26:43,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:45,163 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:26:45,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:47,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:48,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:26:50,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 11:26:58,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:00,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:27:00,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. 
Duration: 0.95 2023-09-30 11:27:05,382 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=697066.6666666666, ans=0.0 2023-09-30 11:27:06,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:27:12,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:16,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:23,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:27:23,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:27:23,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 11:27:24,371 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=15.0 2023-09-30 11:27:25,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 11:27:26,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 11:27:28,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:27:28,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 11:27:31,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 11:27:31,182 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:27:31,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:27:31,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:32,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 11:27:34,881 INFO [train.py:1039] (0/4) Epoch 20, batch 3650, loss[loss=0.1961, simple_loss=0.2654, pruned_loss=0.06343, over 22757.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2513, pruned_loss=0.0501, over 4715802.89 frames. ], batch size: 322, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:27:34,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 11:27:35,357 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:27:36,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:38,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 11:27:44,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. 
Duration: 0.79 2023-09-30 11:27:47,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:27:48,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=697200.0, ans=0.0 2023-09-30 11:27:51,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 11:27:51,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=697266.6666666666, ans=0.125 2023-09-30 11:27:53,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 11:27:58,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:27:58,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:27:58,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:28:01,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:28:02,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:28:02,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 11:28:04,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 11:28:04,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:06,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 11:28:07,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:28:07,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:07,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:09,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:28:09,933 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.50 vs. limit=22.5 2023-09-30 11:28:12,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 11:28:13,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 11:28:15,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:28:18,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. 
Duration: 0.97 2023-09-30 11:28:19,834 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=697333.3333333334, ans=0.125 2023-09-30 11:28:21,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:21,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:28:26,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:28:26,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=697400.0, ans=0.0 2023-09-30 11:28:29,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:29,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:28:29,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:28:29,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=697400.0, ans=0.2 2023-09-30 11:28:31,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:28:31,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=697400.0, ans=0.0 2023-09-30 11:28:33,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 11:28:35,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:36,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:36,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:38,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:28:39,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:39,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:46,610 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 11:28:50,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:50,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:51,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:28:53,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:28:54,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 11:28:56,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:58,365 INFO [train.py:1039] (0/4) Epoch 20, batch 3700, loss[loss=0.177, simple_loss=0.2534, pruned_loss=0.05028, over 24291.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2515, pruned_loss=0.05034, over 4720281.60 frames. ], batch size: 61, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:28:58,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 11:28:58,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:00,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:29:03,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:29:03,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:29:06,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:06,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 11:29:06,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:08,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 11:29:08,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:29:10,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=697533.3333333334, ans=0.0 2023-09-30 11:29:11,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:29:16,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:29:17,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:19,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:29:19,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:20,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:29:22,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:24,418 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 11:29:31,974 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.879e+02 2.205e+02 2.626e+02 3.954e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-30 11:29:32,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 11:29:32,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:29:33,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:29:33,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 11:29:33,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:35,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:37,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 11:29:38,186 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.12 vs. limit=15.0 2023-09-30 11:29:38,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:40,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:29:44,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:44,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-30 11:29:45,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:29:48,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:48,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 11:29:49,787 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.94 vs. limit=15.0 2023-09-30 11:29:50,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:50,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 11:29:55,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:29:56,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:30:00,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:01,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. 
Duration: 0.81 2023-09-30 11:30:01,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=697733.3333333334, ans=0.125 2023-09-30 11:30:03,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:30:03,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:30:03,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:03,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:08,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:09,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 11:30:11,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 11:30:11,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:30:11,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:13,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:30:14,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 11:30:17,755 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.76 vs. limit=10.0 2023-09-30 11:30:18,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:30:21,082 INFO [train.py:1039] (0/4) Epoch 20, batch 3750, loss[loss=0.1704, simple_loss=0.2427, pruned_loss=0.04911, over 23700.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2527, pruned_loss=0.05042, over 4728315.37 frames. ], batch size: 149, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:30:21,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:30:21,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:30:24,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 11:30:25,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:30:27,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:30:29,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 11:30:29,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:30:31,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:30:33,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:34,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:30:38,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=697933.3333333334, ans=0.125 2023-09-30 11:30:40,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:43,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:30:43,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:30:46,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:49,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:30:51,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 11:30:51,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:30:52,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:30:54,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:57,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 11:31:02,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 11:31:02,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:31:04,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:31:04,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:11,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:13,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 11:31:16,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 11:31:19,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:23,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:31:23,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 11:31:24,240 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.65 vs. limit=15.0 2023-09-30 11:31:26,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:31:30,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:31:30,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=698133.3333333334, ans=0.2 2023-09-30 11:31:33,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:31:34,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:31:36,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:31:38,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:31:45,021 INFO [train.py:1039] (0/4) Epoch 20, batch 3800, loss[loss=0.1884, simple_loss=0.2527, pruned_loss=0.0621, over 23681.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2523, pruned_loss=0.05021, over 4730156.14 frames. ], batch size: 232, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:31:47,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:31:50,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:31:52,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:31:53,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 11:31:56,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:57,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:31:59,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:32:00,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 11:32:00,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:01,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:32:03,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:32:03,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:32:04,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:06,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. 
Duration: 0.9 2023-09-30 11:32:06,560 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=698266.6666666666, ans=0.05 2023-09-30 11:32:06,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=698266.6666666666, ans=0.0 2023-09-30 11:32:09,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 11:32:09,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:32:13,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:15,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:32:15,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:32:16,358 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.17 vs. limit=15.0 2023-09-30 11:32:18,367 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.841e+02 1.992e+02 2.236e+02 3.615e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 11:32:18,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:32:18,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:32:20,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:20,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:27,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:32:27,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 11:32:28,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:36,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:32:41,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:32:45,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 11:32:48,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 11:32:49,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:51,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:53,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:32:55,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 11:32:58,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 11:32:58,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 11:33:00,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:00,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:33:06,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:33:07,646 INFO [train.py:1039] (0/4) Epoch 20, batch 3850, loss[loss=0.1593, simple_loss=0.2401, pruned_loss=0.03924, over 24652.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2514, pruned_loss=0.04987, over 4718358.06 frames. ], batch size: 60, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:33:07,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:33:11,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=698533.3333333334, ans=0.125 2023-09-30 11:33:12,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 11:33:13,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=698533.3333333334, ans=0.0 2023-09-30 11:33:14,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 11:33:16,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:33:17,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:21,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:33:23,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:26,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:33:28,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 11:33:28,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=698600.0, ans=0.125 2023-09-30 11:33:34,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:36,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:39,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:33:39,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:33:41,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:41,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:33:43,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:43,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:33:43,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:45,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:45,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=698666.6666666666, ans=0.0 2023-09-30 11:33:46,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:46,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:33:48,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 11:33:48,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. 
Duration: 0.91 2023-09-30 11:33:49,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:49,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:53,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:33:55,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:55,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 11:33:55,798 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=698666.6666666666, ans=0.125 2023-09-30 11:33:58,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 11:34:00,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:01,245 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.90 vs. limit=15.0 2023-09-30 11:34:02,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 11:34:04,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:34:08,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:34:10,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:34:13,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:13,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 11:34:16,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 11:34:18,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:18,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:21,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:34:21,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:34:23,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:34:23,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. 
Duration: 0.97 2023-09-30 11:34:23,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:34:27,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 11:34:27,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:27,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:28,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:34:29,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:30,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:34:32,491 INFO [train.py:1039] (0/4) Epoch 20, batch 3900, loss[loss=0.1656, simple_loss=0.2327, pruned_loss=0.04919, over 23415.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.251, pruned_loss=0.04963, over 4732340.26 frames. ], batch size: 285, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:34:32,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:32,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:32,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:34:32,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 11:34:32,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:37,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:38,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:38,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:34:40,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:41,933 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=698866.6666666666, ans=0.0 2023-09-30 11:34:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:43,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:46,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:34:46,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 11:34:47,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:34:49,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 11:34:49,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:51,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 11:34:52,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 11:34:58,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:34:59,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:59,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:35:01,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:04,762 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.817e+02 2.055e+02 2.286e+02 3.490e+02, threshold=4.109e+02, percent-clipped=0.0 2023-09-30 11:35:06,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:35:08,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:35:10,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 11:35:10,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:11,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:35:17,146 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.46 vs. limit=15.0 2023-09-30 11:35:17,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:17,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:35:23,045 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=699066.6666666666, ans=0.07 2023-09-30 11:35:24,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:35:24,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=699066.6666666666, ans=0.04949747468305833 2023-09-30 11:35:25,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:35:37,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:35:41,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 11:35:43,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 11:35:43,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 11:35:43,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:45,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 11:35:46,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:47,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 11:35:53,935 INFO [train.py:1039] (0/4) Epoch 20, batch 3950, loss[loss=0.1544, simple_loss=0.2357, pruned_loss=0.03651, over 24443.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2506, pruned_loss=0.04964, over 4714355.75 frames. ], batch size: 63, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:35:57,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:57,167 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 11:35:58,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:36:03,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 11:36:04,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:36:13,265 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 11:36:14,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:14,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 11:36:14,850 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 11:36:16,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:18,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:18,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:36:18,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:22,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 11:36:25,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:36:25,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 11:36:25,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:36:25,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:36:26,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:36:36,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:36:36,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:36:41,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 11:36:47,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699400.0, ans=0.1 2023-09-30 11:36:48,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 11:36:48,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 11:36:48,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:36:50,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:36:58,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 11:36:58,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:36:59,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:59,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:37:00,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 11:37:04,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:37:05,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:37:08,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 11:37:14,077 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.85 vs. limit=22.5 2023-09-30 11:37:15,282 INFO [train.py:1039] (0/4) Epoch 20, batch 4000, loss[loss=0.1656, simple_loss=0.2371, pruned_loss=0.0471, over 18660.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2517, pruned_loss=0.04969, over 4721257.29 frames. ], batch size: 40, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:37:20,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:37:23,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699533.3333333334, ans=0.1 2023-09-30 11:37:27,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:33,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:33,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:37:35,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:35,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 11:37:35,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:37:35,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 11:37:35,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:37:35,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 11:37:38,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 11:37:41,989 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=699600.0, ans=0.0 2023-09-30 11:37:43,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:37:43,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:37:43,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:37:43,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:37:43,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:37:44,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:37:45,000 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 11:37:46,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:37:46,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:37:47,035 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699666.6666666666, ans=0.1 2023-09-30 11:37:48,530 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.791e+02 1.984e+02 2.297e+02 3.398e+02, threshold=3.968e+02, percent-clipped=0.0 2023-09-30 11:37:49,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=699666.6666666666, ans=0.125 2023-09-30 11:37:50,238 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 11:37:50,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:37:50,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:37:59,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 11:37:59,317 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:38:00,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:38:02,416 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 11:38:03,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:38:04,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-09-30 11:38:04,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:05,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:05,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:38:07,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:38:08,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:38:08,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:38:10,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 11:38:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:13,150 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 11:38:16,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=699733.3333333334, ans=0.125 2023-09-30 11:38:19,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:38:22,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. 
Duration: 0.4181875 2023-09-30 11:38:25,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:38:26,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:27,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:38:27,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=699800.0, ans=0.125 2023-09-30 11:38:28,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:38:34,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:36,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:38:36,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 11:38:37,995 INFO [train.py:1039] (0/4) Epoch 20, batch 4050, loss[loss=0.2001, simple_loss=0.2643, pruned_loss=0.06793, over 23822.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2518, pruned_loss=0.04962, over 4719819.16 frames. ], batch size: 212, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:38:39,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:38:39,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 11:38:41,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:38:41,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:38:42,168 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.94 vs. limit=22.5 2023-09-30 11:38:42,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:46,170 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=699866.6666666666, ans=0.0 2023-09-30 11:38:47,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:50,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:38:52,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:38:55,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:38:55,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:59,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:39:02,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:39:03,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 11:39:07,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 11:39:07,379 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 11:39:09,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:39:14,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 11:39:15,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:20,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:21,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:22,099 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=700000.0, ans=0.125 2023-09-30 11:39:23,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:39:23,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 11:39:26,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:39:28,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 11:39:28,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:39:31,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:33,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 11:39:37,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=700066.6666666666, ans=0.125 2023-09-30 11:39:39,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:43,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=700133.3333333334, ans=0.0 2023-09-30 11:39:46,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 11:39:47,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:39:49,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. 
Duration: 0.95 2023-09-30 11:39:49,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 11:39:49,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:39:52,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:39:53,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:39:53,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:40:00,017 INFO [train.py:1039] (0/4) Epoch 20, batch 4100, loss[loss=0.1864, simple_loss=0.268, pruned_loss=0.05239, over 24388.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2529, pruned_loss=0.05045, over 4719054.35 frames. ], batch size: 77, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:40:01,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 11:40:03,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 11:40:03,512 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=700200.0, ans=0.035 2023-09-30 11:40:05,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 11:40:07,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-09-30 11:40:07,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:09,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:40:10,782 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 11:40:14,407 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:14,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:40:14,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:16,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:40:16,230 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=700266.6666666666, ans=10.0 2023-09-30 11:40:21,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:40:21,892 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.08 vs. 
limit=22.5 2023-09-30 11:40:22,551 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:22,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:40:22,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 11:40:24,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:24,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:40:25,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:25,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:40:25,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 11:40:28,765 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:30,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 11:40:31,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:40:34,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 11:40:34,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 11:40:36,118 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.811e+02 2.046e+02 2.303e+02 3.809e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-30 11:40:36,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:40:37,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:40:37,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:40:41,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 11:40:43,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:40:43,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:40:46,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 11:40:48,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:48,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 11:40:50,619 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=700400.0, ans=0.125 2023-09-30 11:40:51,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:58,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:02,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:04,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:41:12,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=700466.6666666666, ans=0.2 2023-09-30 11:41:13,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:13,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:41:15,912 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.49 vs. limit=15.0 2023-09-30 11:41:18,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:19,169 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.53 vs. 
limit=15.0 2023-09-30 11:41:20,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:41:22,556 INFO [train.py:1039] (0/4) Epoch 20, batch 4150, loss[loss=0.1581, simple_loss=0.2325, pruned_loss=0.04183, over 24464.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.253, pruned_loss=0.05016, over 4719142.19 frames. ], batch size: 58, lr: 5.11e-03, grad_scale: 4.0 2023-09-30 11:41:25,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:41:27,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:41:27,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:41:27,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:30,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 11:41:30,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:32,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 11:41:32,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 11:41:32,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. 
Duration: 0.89 2023-09-30 11:41:34,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:40,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:41:40,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:44,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:41:45,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:41:46,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:41:48,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:41:48,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:50,111 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:41:55,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:41:58,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=700666.6666666666, ans=0.125 2023-09-30 11:41:59,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:01,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 11:42:02,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 11:42:02,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:42:04,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 11:42:04,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:42:04,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:04,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=700666.6666666666, ans=0.0 2023-09-30 11:42:05,290 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.00 vs. limit=6.0 2023-09-30 11:42:09,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:42:11,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:15,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 11:42:18,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:20,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:42:20,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 11:42:20,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:23,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 11:42:25,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:42:26,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:26,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:28,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 11:42:28,249 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:42:29,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:42:31,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:42:34,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 11:42:34,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:34,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:42:34,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:42:36,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 11:42:36,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:36,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:42:36,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:42:39,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:39,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. 
Duration: 0.72 2023-09-30 11:42:41,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:43,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=700866.6666666666, ans=0.125 2023-09-30 11:42:44,278 INFO [train.py:1039] (0/4) Epoch 20, batch 4200, loss[loss=0.1716, simple_loss=0.2414, pruned_loss=0.05085, over 23513.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2527, pruned_loss=0.0504, over 4727652.83 frames. ], batch size: 134, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:42:46,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:42:46,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 11:42:47,064 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=700866.6666666666, ans=0.125 2023-09-30 11:42:49,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:42:51,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:42:54,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:42:54,568 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:54,571 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:42:57,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 11:42:59,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 11:43:00,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:02,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:06,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:43:09,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:43:11,154 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:12,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:12,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 11:43:12,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:12,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:14,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:43:14,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:43:16,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:43:16,961 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.34 vs. limit=15.0 2023-09-30 11:43:17,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 11:43:19,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:20,933 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.891e+02 2.083e+02 2.448e+02 3.727e+02, threshold=4.165e+02, percent-clipped=0.0 2023-09-30 11:43:25,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:43:27,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:43:28,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:43:28,942 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=701000.0, ans=0.0 2023-09-30 11:43:30,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:43:31,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:43:32,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 11:43:32,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:33,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:43:40,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:43:40,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:44,173 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.52 vs. limit=15.0 2023-09-30 11:43:46,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:43:46,970 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=701066.6666666666, ans=0.0 2023-09-30 11:43:49,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 11:43:51,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:56,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 11:43:58,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:43:58,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=701133.3333333334, ans=0.1 2023-09-30 11:44:00,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 11:44:04,250 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:44:05,347 INFO [train.py:1039] (0/4) Epoch 20, batch 4250, loss[loss=0.1847, simple_loss=0.255, pruned_loss=0.05714, over 23280.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2514, pruned_loss=0.05005, over 4731318.87 frames. ], batch size: 119, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:44:06,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:44:07,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=701200.0, ans=0.125 2023-09-30 11:44:11,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:44:11,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:44:13,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:44:16,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=701200.0, ans=0.125 2023-09-30 11:44:17,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:44:19,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 11:44:19,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:44:23,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:27,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:29,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:30,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:32,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:44:32,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:44:33,848 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=701266.6666666666, ans=0.125 2023-09-30 11:44:35,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 11:44:37,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:37,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:40,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:44:41,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:44:43,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 11:44:48,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 11:44:48,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:49,213 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.16 vs. limit=15.0 2023-09-30 11:44:50,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:50,104 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:51,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 11:44:51,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:51,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:54,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:44:54,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:44:59,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:01,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:03,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 11:45:03,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:45:03,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 11:45:04,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:45:06,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:45:07,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 11:45:07,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:45:09,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 11:45:11,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:45:12,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:45:18,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:18,918 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=701466.6666666666, ans=0.125 2023-09-30 11:45:21,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:23,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:45:23,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:25,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:26,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 11:45:27,986 INFO [train.py:1039] (0/4) Epoch 20, batch 4300, loss[loss=0.1773, simple_loss=0.2524, pruned_loss=0.05111, over 23427.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2508, pruned_loss=0.04996, over 4730773.81 frames. ], batch size: 105, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:45:28,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:45:28,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 11:45:29,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:33,110 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=701533.3333333334, ans=0.0 2023-09-30 11:45:34,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:34,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:45:36,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=701533.3333333334, ans=0.0 2023-09-30 11:45:39,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:44,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:44,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. 
Duration: 0.56 2023-09-30 11:45:44,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:45:48,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:45:48,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:45:48,596 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 11:45:51,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:45:54,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:45:57,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 11:45:57,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:45:57,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 11:46:00,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:46:03,845 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.784e+02 1.943e+02 2.161e+02 2.799e+02, threshold=3.885e+02, percent-clipped=0.0 2023-09-30 11:46:03,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 11:46:05,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:46:05,714 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:46:06,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=701666.6666666666, ans=0.125 2023-09-30 11:46:07,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:46:08,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:10,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:46:10,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 11:46:11,133 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=701666.6666666666, ans=0.125 2023-09-30 11:46:12,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 11:46:15,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:46:19,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:19,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-30 11:46:20,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:20,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:20,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 11:46:20,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 11:46:20,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 11:46:20,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=701733.3333333334, ans=0.1 2023-09-30 11:46:22,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:46:22,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 11:46:24,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 11:46:24,549 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=701733.3333333334, ans=0.0 2023-09-30 11:46:27,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:29,442 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. 
Duration: 0.032 2023-09-30 11:46:30,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:46:32,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:32,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:35,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 11:46:35,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:46:35,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:37,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:46:37,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:37,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:46:40,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:46:43,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:44,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:46:44,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:49,811 INFO [train.py:1039] (0/4) Epoch 20, batch 4350, loss[loss=0.1581, simple_loss=0.2407, pruned_loss=0.0377, over 24683.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2517, pruned_loss=0.0499, over 4729438.91 frames. ], batch size: 65, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:46:51,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 11:46:51,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:46:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:00,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:04,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:47:04,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:47:07,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=701933.3333333334, ans=0.04949747468305833 2023-09-30 11:47:08,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:47:13,287 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:47:15,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:47:15,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:18,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=701933.3333333334, ans=0.0 2023-09-30 11:47:19,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:47:21,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:47:22,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:47:30,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 11:47:30,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:30,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=702000.0, ans=0.125 2023-09-30 11:47:31,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:34,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=702000.0, ans=0.0 2023-09-30 11:47:36,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:47:39,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 11:47:42,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:44,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:47:47,399 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 11:47:48,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:48,927 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:47:50,396 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 11:47:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 11:47:51,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:51,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:53,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:47:54,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:47:56,512 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:56,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:58,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 11:47:58,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:58,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:59,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:00,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 11:48:01,934 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 11:48:01,941 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 11:48:01,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 11:48:07,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:48:09,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 11:48:09,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:09,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:48:10,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 11:48:13,803 INFO [train.py:1039] (0/4) Epoch 20, batch 4400, loss[loss=0.1863, simple_loss=0.2711, pruned_loss=0.05079, over 24393.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.253, pruned_loss=0.05039, over 4732244.18 frames. ], batch size: 77, lr: 5.10e-03, grad_scale: 16.0 2023-09-30 11:48:13,954 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 11:48:13,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:17,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=702200.0, ans=0.125 2023-09-30 11:48:18,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:18,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:20,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:48:23,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-09-30 11:48:23,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 11:48:23,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 11:48:23,383 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 11:48:24,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:48:24,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:27,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 11:48:29,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:31,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:31,051 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 11:48:34,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:34,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 11:48:34,965 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 11:48:36,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. 
Duration: 0.68 2023-09-30 11:48:38,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 11:48:38,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 11:48:39,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:41,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:42,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:42,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:48:45,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 11:48:45,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 11:48:47,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:49,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:48:49,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:49,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 11:48:50,488 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.844e+02 2.019e+02 2.293e+02 3.220e+02, threshold=4.037e+02, percent-clipped=0.0 2023-09-30 11:48:50,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:50,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 11:48:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 11:48:55,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:01,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:49:04,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 11:49:09,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:49:13,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:14,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:49:14,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 11:49:16,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 11:49:16,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:49:16,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:49:18,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:49:23,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 11:49:27,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 11:49:28,033 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=702466.6666666666, ans=0.2 2023-09-30 11:49:29,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 11:49:29,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:49:29,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 11:49:30,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:49:33,868 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 11:49:34,309 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:49:35,358 INFO [train.py:1039] (0/4) Epoch 20, batch 4450, loss[loss=0.2126, simple_loss=0.2781, pruned_loss=0.07359, over 23521.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2541, pruned_loss=0.05116, over 4718224.22 frames. ], batch size: 256, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:49:35,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 11:49:40,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:43,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:43,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:49:50,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:49:50,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:49:54,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:54,753 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.71 vs. limit=15.0 2023-09-30 11:49:57,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 11:50:01,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:50:01,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:01,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 11:50:01,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:03,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:03,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:03,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:50:06,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:50:09,018 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=702666.6666666666, ans=0.0 2023-09-30 11:50:10,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:10,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:11,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 11:50:11,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:13,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:50:14,146 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.74 vs. limit=22.5 2023-09-30 11:50:16,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:50:18,536 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 11:50:18,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 11:50:18,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:50:22,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:22,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 11:50:29,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:50:32,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:33,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. 
Duration: 0.8 2023-09-30 11:50:33,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:33,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:33,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:50:33,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:35,767 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=702733.3333333334, ans=0.125 2023-09-30 11:50:36,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:41,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:50:41,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 11:50:41,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=702800.0, ans=0.1 2023-09-30 11:50:43,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:50:44,717 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:50:46,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:47,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:47,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:50:50,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:50:52,985 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=702800.0, ans=0.0 2023-09-30 11:50:54,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 11:50:54,584 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=702800.0, ans=0.125 2023-09-30 11:50:55,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:50:57,869 INFO [train.py:1039] (0/4) Epoch 20, batch 4500, loss[loss=0.1729, simple_loss=0.2399, pruned_loss=0.05297, over 23787.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2539, pruned_loss=0.05096, over 4711773.73 frames. ], batch size: 195, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:51:03,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:04,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. 
Duration: 0.72 2023-09-30 11:51:04,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 11:51:05,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:05,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=702866.6666666666, ans=0.1 2023-09-30 11:51:10,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:51:10,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:11,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:51:11,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:51:11,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:13,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:13,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=702933.3333333334, ans=0.1 2023-09-30 11:51:25,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:25,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:51:31,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:31,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:51:33,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:51:36,925 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.958e+02 2.176e+02 2.653e+02 3.969e+02, threshold=4.352e+02, percent-clipped=0.0 2023-09-30 11:51:38,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:51:42,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:51:45,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:51:48,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:51:48,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 11:51:48,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:51:49,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:51:50,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:51,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:55,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:55,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 11:51:55,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:51:55,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:00,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:52:00,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:52:06,610 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:06,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:52:08,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:52:09,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-09-30 11:52:12,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 11:52:12,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 11:52:15,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 11:52:16,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 11:52:17,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:20,725 INFO [train.py:1039] (0/4) Epoch 20, batch 4550, loss[loss=0.1771, simple_loss=0.2213, pruned_loss=0.06647, over 19325.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2527, pruned_loss=0.05038, over 4707055.43 frames. ], batch size: 388, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:52:22,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:22,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:24,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:28,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:52:30,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:52:34,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:52:34,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:52:34,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:37,964 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:38,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:41,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:52:44,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 11:52:45,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 11:52:47,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:52:48,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 11:52:51,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 11:52:53,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:52:56,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 11:52:58,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:53:02,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:53:05,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 11:53:09,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:11,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:11,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:53:13,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:14,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 11:53:14,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. 
Duration: 0.88 2023-09-30 11:53:14,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:53:16,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 11:53:19,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 11:53:19,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:20,703 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:20,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:23,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:23,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:53:25,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:53:25,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 11:53:26,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:26,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. 
Duration: 0.4090625 2023-09-30 11:53:26,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 11:53:26,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=703466.6666666666, ans=0.125 2023-09-30 11:53:28,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:53:28,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 11:53:31,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:53:31,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:53:35,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:53:35,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:35,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:53:38,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:53:38,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:53:40,330 INFO [train.py:1039] (0/4) Epoch 20, batch 4600, loss[loss=0.176, simple_loss=0.2501, pruned_loss=0.051, over 22908.00 frames. 
], tot_loss[loss=0.1754, simple_loss=0.251, pruned_loss=0.04993, over 4705162.94 frames. ], batch size: 50, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:53:40,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:42,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:46,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:53:47,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:53:48,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:49,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 11:53:50,220 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=703533.3333333334, ans=0.1 2023-09-30 11:53:51,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:53:51,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=703533.3333333334, ans=0.025 2023-09-30 11:53:56,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:53:56,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:53:57,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:59,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:54:03,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:54:04,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 11:54:04,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:54:05,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:08,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:11,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:54:11,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:13,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=703666.6666666666, ans=0.1 2023-09-30 11:54:18,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 11:54:18,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 11:54:19,796 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.811e+02 2.166e+02 2.771e+02 4.574e+02, threshold=4.333e+02, percent-clipped=1.0 2023-09-30 11:54:19,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:54:20,733 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.14 vs. limit=6.0 2023-09-30 11:54:27,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:27,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:54:29,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:54:33,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 11:54:35,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:54:39,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:40,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=703733.3333333334, ans=0.125 2023-09-30 11:54:41,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:54:42,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:42,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 11:54:42,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:44,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 11:54:44,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:45,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:46,245 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=703800.0, ans=0.2 2023-09-30 11:54:47,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:47,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:48,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:49,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 11:54:51,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. 
Duration: 0.81 2023-09-30 11:54:51,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 11:54:51,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:53,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:54:53,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:55,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:55:02,984 INFO [train.py:1039] (0/4) Epoch 20, batch 4650, loss[loss=0.1699, simple_loss=0.259, pruned_loss=0.04046, over 24309.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.04968, over 4724023.83 frames. ], batch size: 74, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:55:03,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=703866.6666666666, ans=0.1 2023-09-30 11:55:06,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:55:09,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:09,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:09,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 11:55:09,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:55:09,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:12,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:15,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 11:55:17,471 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=703933.3333333334, ans=0.0 2023-09-30 11:55:18,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:55:20,291 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 11:55:21,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:21,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 11:55:23,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:55:23,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 11:55:23,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. 
Duration: 0.98 2023-09-30 11:55:25,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:26,784 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:55:28,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:55:30,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:30,306 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 11:55:34,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:34,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=704000.0, ans=0.125 2023-09-30 11:55:35,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 11:55:39,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:39,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:55:39,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 11:55:40,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 11:55:43,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:55:46,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:52,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:54,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:56,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:56,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:56:01,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 11:56:01,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 11:56:01,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 11:56:01,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 11:56:03,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:10,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 11:56:10,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:10,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 11:56:10,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:12,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:13,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:56:14,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:56:17,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:56:17,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:17,925 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=704133.3333333334, ans=0.125 2023-09-30 11:56:19,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:56:22,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:56:22,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:56:22,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:56:22,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 11:56:23,862 INFO [train.py:1039] (0/4) Epoch 20, batch 4700, loss[loss=0.2009, simple_loss=0.2689, pruned_loss=0.06647, over 22837.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2519, pruned_loss=0.04975, over 4733383.99 frames. ], batch size: 322, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:56:24,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:56:27,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 11:56:36,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:36,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:37,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:56:39,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:41,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 11:56:44,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 11:56:44,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 11:56:48,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:48,734 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:56:50,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:56:54,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:00,545 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.959e+02 2.502e+02 2.880e+02 4.077e+02, threshold=5.005e+02, percent-clipped=0.0 2023-09-30 11:57:00,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:57:02,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:57:05,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:57:11,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. 
Duration: 0.68 2023-09-30 11:57:11,568 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:57:12,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:57:16,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:19,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 11:57:19,785 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=704400.0, ans=0.07 2023-09-30 11:57:21,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:57:24,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:57:24,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 11:57:27,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:27,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:30,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:30,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 11:57:30,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 11:57:31,907 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 11:57:32,177 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=704466.6666666666, ans=0.0 2023-09-30 11:57:33,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:35,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:35,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:36,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 11:57:36,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:42,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 11:57:45,987 INFO [train.py:1039] (0/4) Epoch 20, batch 4750, loss[loss=0.1959, simple_loss=0.2793, pruned_loss=0.05621, over 24556.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2527, pruned_loss=0.05021, over 4733025.11 frames. ], batch size: 71, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:57:46,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 11:57:46,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:48,957 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.70 vs. limit=22.5 2023-09-30 11:57:51,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:51,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:57:55,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 11:57:55,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:00,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 11:58:01,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:58:01,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:03,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 11:58:05,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=704600.0, ans=0.1 2023-09-30 11:58:06,647 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=704600.0, ans=0.015 2023-09-30 11:58:09,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 11:58:14,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:58:16,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 11:58:16,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:19,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:21,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 11:58:21,475 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 11:58:24,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. 
Duration: 0.82 2023-09-30 11:58:28,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:28,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:31,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:58:31,573 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 11:58:32,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:34,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:58:37,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:58:39,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 11:58:39,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 11:58:39,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:39,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=704733.3333333334, ans=0.0 2023-09-30 11:58:40,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 11:58:40,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:41,667 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=704733.3333333334, ans=15.0 2023-09-30 11:58:42,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:58:43,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 11:58:48,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 11:58:50,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:58:53,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:53,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 11:58:53,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:56,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:58,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:58:58,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 11:58:59,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:59:03,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:03,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 11:59:03,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 11:59:05,253 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 11:59:08,053 INFO [train.py:1039] (0/4) Epoch 20, batch 4800, loss[loss=0.2091, simple_loss=0.2757, pruned_loss=0.07129, over 23643.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2532, pruned_loss=0.05039, over 4744422.15 frames. ], batch size: 256, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 11:59:08,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:59:09,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:09,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 11:59:15,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:17,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 11:59:23,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:59:23,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:24,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:26,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 11:59:26,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:59:26,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:59:28,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:59:32,535 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.82 vs. limit=15.0 2023-09-30 11:59:33,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:59:34,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:34,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 11:59:35,159 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=704933.3333333334, ans=0.1 2023-09-30 11:59:36,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:36,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:59:36,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:38,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:38,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=704933.3333333334, ans=0.0 2023-09-30 11:59:41,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:42,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. 
Duration: 0.72225 2023-09-30 11:59:45,726 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.894e+02 2.052e+02 2.398e+02 3.146e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 11:59:45,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:59:46,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:48,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 11:59:48,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 11:59:50,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:50,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:59:51,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:59:51,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:51,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:59:53,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:59:55,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 11:59:59,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:04,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:05,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:12,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 12:00:12,467 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:12,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:12,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:00:12,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705133.3333333334, ans=0.1 2023-09-30 12:00:14,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 12:00:15,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=705133.3333333334, ans=0.125 2023-09-30 12:00:17,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=705133.3333333334, ans=0.125 2023-09-30 12:00:18,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:19,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=705133.3333333334, ans=0.125 2023-09-30 12:00:20,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:00:20,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:21,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:00:21,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:00:23,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:00:26,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:26,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:26,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:00:27,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 12:00:29,406 INFO [train.py:1039] (0/4) Epoch 20, batch 4850, loss[loss=0.1663, simple_loss=0.2493, pruned_loss=0.04163, over 24410.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2531, pruned_loss=0.05082, over 4741165.88 frames. ], batch size: 66, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:00:29,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 12:00:29,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:29,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:32,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=705200.0, ans=0.2 2023-09-30 12:00:33,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:00:33,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:35,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:39,378 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=705200.0, ans=0.125 2023-09-30 12:00:45,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. 
Duration: 0.51 2023-09-30 12:00:47,180 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:48,326 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.04 vs. limit=22.5 2023-09-30 12:00:51,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:00:51,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:00:52,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:55,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:56,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=705266.6666666666, ans=0.125 2023-09-30 12:00:57,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:00:59,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:00:59,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 12:01:01,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:01:03,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=705333.3333333334, ans=0.2 2023-09-30 12:01:03,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=705333.3333333334, ans=0.0 2023-09-30 12:01:05,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:01:06,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:01:06,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:01:06,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 12:01:10,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:01:10,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:12,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=705333.3333333334, ans=0.125 2023-09-30 12:01:14,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:15,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 12:01:15,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. 
Duration: 0.67 2023-09-30 12:01:16,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:01:23,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:01:23,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 12:01:25,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:01:25,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:01:26,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:01:26,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 12:01:26,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:28,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 12:01:28,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:29,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:31,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. 
Duration: 0.73 2023-09-30 12:01:36,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705466.6666666666, ans=0.1 2023-09-30 12:01:38,742 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:01:39,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:45,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:01:45,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:01:49,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 12:01:49,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:01:53,021 INFO [train.py:1039] (0/4) Epoch 20, batch 4900, loss[loss=0.153, simple_loss=0.231, pruned_loss=0.03751, over 24362.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2523, pruned_loss=0.05062, over 4737604.59 frames. ], batch size: 56, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:01:56,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:58,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:58,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 12:02:02,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 12:02:08,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 12:02:12,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 12:02:13,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 12:02:13,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:13,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:02:15,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:02:15,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:15,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:02:15,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 12:02:17,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=705600.0, ans=0.125 2023-09-30 12:02:20,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. 
Duration: 0.73 2023-09-30 12:02:20,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:02:23,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:02:25,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:28,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:02:28,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:29,516 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.910e+02 2.109e+02 2.496e+02 4.455e+02, threshold=4.218e+02, percent-clipped=1.0 2023-09-30 12:02:31,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:31,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 12:02:33,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:02:33,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:33,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 12:02:33,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-09-30 12:02:35,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=705666.6666666666, ans=0.1 2023-09-30 12:02:39,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 12:02:41,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:02:42,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:02:42,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:02:42,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=705733.3333333334, ans=0.1 2023-09-30 12:02:44,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:44,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:02:44,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:02:44,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. 
Duration: 0.96 2023-09-30 12:02:44,473 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=705733.3333333334, ans=0.125 2023-09-30 12:02:49,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:51,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:02:52,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:02:56,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 12:02:56,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:02:56,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:02:56,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 12:03:03,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:05,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:06,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 12:03:06,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. 
Duration: 0.6444375 2023-09-30 12:03:06,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:03:09,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:14,420 INFO [train.py:1039] (0/4) Epoch 20, batch 4950, loss[loss=0.1593, simple_loss=0.2343, pruned_loss=0.04213, over 24437.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.251, pruned_loss=0.05035, over 4742663.90 frames. ], batch size: 58, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:03:14,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:14,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:03:15,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:15,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 12:03:17,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:03:21,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:21,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:24,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. 
Duration: 0.78 2023-09-30 12:03:24,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 12:03:24,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:03:24,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 12:03:26,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:26,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:03:26,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:03:26,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:29,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:29,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:03:31,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:03:32,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:34,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:03:34,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:37,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:03:39,434 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.86 vs. limit=15.0 2023-09-30 12:03:43,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:46,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:48,420 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.81 vs. limit=10.0 2023-09-30 12:03:49,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:50,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:03:52,664 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 12:03:52,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. 
Duration: 0.85 2023-09-30 12:03:56,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:58,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:03:58,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:03:59,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:03:59,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:04:01,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:04:04,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:06,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:04:09,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:04:10,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:10,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:12,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. 
Duration: 0.98 2023-09-30 12:04:12,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:04:12,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:04:15,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=706066.6666666666, ans=0.1 2023-09-30 12:04:17,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:04:19,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:04:19,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:04:20,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:20,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:04:22,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:04:22,361 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=706133.3333333334, ans=0.0 2023-09-30 12:04:23,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 12:04:23,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:04:23,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:25,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 12:04:31,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.28 vs. limit=15.0 2023-09-30 12:04:34,051 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:37,612 INFO [train.py:1039] (0/4) Epoch 20, batch 5000, loss[loss=0.1779, simple_loss=0.2515, pruned_loss=0.05212, over 23681.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2505, pruned_loss=0.04943, over 4747056.42 frames. ], batch size: 149, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:04:38,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=706200.0, ans=0.0 2023-09-30 12:04:39,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 12:04:39,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:04:47,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:47,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 12:04:47,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 12:04:47,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=706200.0, ans=0.125 2023-09-30 12:04:48,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 12:04:49,576 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.39 vs. limit=15.0 2023-09-30 12:04:50,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:04:52,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 12:04:53,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:04:54,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:04:54,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 12:04:55,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:55,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:04:57,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. 
Duration: 0.61 2023-09-30 12:04:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:57,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:04:58,124 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.87 vs. limit=12.0 2023-09-30 12:04:58,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 12:05:00,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 12:05:01,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:05:02,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 12:05:02,080 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:05:02,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:03,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:05:03,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 12:05:03,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. 
Duration: 0.6 2023-09-30 12:05:03,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 12:05:04,612 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.36 vs. limit=15.0 2023-09-30 12:05:05,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:06,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:07,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 12:05:09,570 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:05:11,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:11,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:05:12,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:05:13,382 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=5.627e-03 2023-09-30 12:05:14,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 12:05:14,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 12:05:15,914 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.770e+02 2.000e+02 2.327e+02 3.746e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 12:05:16,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:05:19,370 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 12:05:22,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:05:23,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=706333.3333333334, ans=0.125 2023-09-30 12:05:24,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:24,540 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:29,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 12:05:29,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:29,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:30,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:05:32,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. 
Duration: 0.336375 2023-09-30 12:05:32,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:35,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:37,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:05:43,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 12:05:43,566 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=706466.6666666666, ans=0.1 2023-09-30 12:05:47,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:48,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=706466.6666666666, ans=0.125 2023-09-30 12:05:51,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=706466.6666666666, ans=0.05 2023-09-30 12:05:58,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:59,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:00,998 INFO [train.py:1039] (0/4) Epoch 20, batch 5050, loss[loss=0.1786, simple_loss=0.2619, pruned_loss=0.04764, over 24475.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2506, pruned_loss=0.04956, over 4728612.88 frames. 
], batch size: 66, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:06:01,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:06:01,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:01,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:06:01,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:06:01,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 12:06:07,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:06:09,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:11,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:06:12,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. 
Duration: 0.94 2023-09-30 12:06:15,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:15,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:06:16,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:06:17,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=706600.0, ans=0.0 2023-09-30 12:06:18,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:06:19,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:06:20,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=706600.0, ans=0.125 2023-09-30 12:06:23,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=706600.0, ans=0.125 2023-09-30 12:06:29,051 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=706600.0, ans=0.0 2023-09-30 12:06:29,341 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.50 vs. limit=15.0 2023-09-30 12:06:30,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 12:06:30,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 12:06:32,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:32,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 12:06:34,375 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:06:36,472 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.76 vs. limit=15.0 2023-09-30 12:06:37,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:37,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:37,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:06:37,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 12:06:38,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 12:06:39,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:40,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:06:43,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 12:06:45,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 12:06:47,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:06:49,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 12:06:51,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:06:53,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:06:53,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:06:53,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:54,249 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.64 vs. limit=22.5 2023-09-30 12:06:55,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:06:58,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:06:58,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:59,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:06:59,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:06:59,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 12:06:59,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:07:01,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:07:05,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:07:05,964 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 12:07:05,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:07:06,180 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=706800.0, ans=0.125 2023-09-30 12:07:07,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:09,507 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:09,561 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 12:07:13,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 12:07:13,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 12:07:13,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:14,904 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-09-30 12:07:17,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:17,434 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=706800.0, ans=0.0 2023-09-30 12:07:18,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:18,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 12:07:20,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 12:07:24,214 INFO [train.py:1039] (0/4) Epoch 20, batch 5100, loss[loss=0.1654, simple_loss=0.2462, pruned_loss=0.04234, over 24486.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2524, pruned_loss=0.0503, over 4721793.39 frames. ], batch size: 66, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:07:24,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:24,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 12:07:24,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:07:27,894 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 12:07:30,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:33,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 12:07:33,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 12:07:35,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:36,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:07:40,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:40,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 12:07:40,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 12:07:42,657 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:07:46,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:07:46,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:07:51,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:54,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 12:07:54,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:58,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:58,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 12:08:00,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,551 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.819e+02 2.005e+02 2.241e+02 3.147e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 12:08:01,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 12:08:03,921 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 12:08:05,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:08:05,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 12:08:05,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 12:08:06,072 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.77 vs. limit=15.0 2023-09-30 12:08:10,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:08:16,594 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=707066.6666666666, ans=0.0 2023-09-30 12:08:18,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:18,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=707066.6666666666, ans=0.125 2023-09-30 12:08:20,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 12:08:21,556 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 12:08:21,580 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 12:08:23,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 12:08:23,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:08:26,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 12:08:29,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 12:08:33,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 12:08:35,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:08:35,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=707133.3333333334, ans=0.05 2023-09-30 12:08:36,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 12:08:40,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:08:40,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 12:08:45,241 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.78 vs. limit=10.0 2023-09-30 12:08:46,128 INFO [train.py:1039] (0/4) Epoch 20, batch 5150, loss[loss=0.1767, simple_loss=0.2574, pruned_loss=0.048, over 24031.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2533, pruned_loss=0.05085, over 4714476.00 frames. ], batch size: 80, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:08:46,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 12:08:46,895 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.84 vs. limit=15.0 2023-09-30 12:08:47,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:08:47,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:08:47,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:08:47,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:08:48,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=707200.0, ans=0.5 2023-09-30 12:08:49,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:08:51,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 12:08:51,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 12:08:51,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 12:08:51,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:08:51,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. 
Duration: 0.97 2023-09-30 12:08:53,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:53,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:08:55,678 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.77 vs. limit=15.0 2023-09-30 12:08:56,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:08:57,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:09:02,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:09:02,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 12:09:04,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:04,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:09:07,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:09:07,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 12:09:07,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:07,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:09:07,881 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:09:10,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 12:09:10,932 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.67 vs. limit=15.0 2023-09-30 12:09:11,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:09:11,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:15,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:09:16,990 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 12:09:18,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:09:21,214 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.15 vs. 
limit=15.0 2023-09-30 12:09:23,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:09:24,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 12:09:25,077 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=707333.3333333334, ans=0.2 2023-09-30 12:09:29,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:31,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=707333.3333333334, ans=0.1 2023-09-30 12:09:36,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:38,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:42,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:44,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:47,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 12:09:49,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:09:51,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 12:09:51,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:56,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:56,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:59,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 12:10:03,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:04,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:10:05,138 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.33 vs. limit=15.0 2023-09-30 12:10:07,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:10:07,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:10:07,976 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=707533.3333333334, ans=0.0 2023-09-30 12:10:08,550 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.12 vs. 
limit=6.0 2023-09-30 12:10:09,493 INFO [train.py:1039] (0/4) Epoch 20, batch 5200, loss[loss=0.1697, simple_loss=0.2305, pruned_loss=0.05445, over 22741.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2535, pruned_loss=0.0506, over 4719083.40 frames. ], batch size: 322, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:10:09,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:10:09,866 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=707533.3333333334, ans=0.0 2023-09-30 12:10:11,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:10:11,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:10:11,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:10:14,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:10:16,514 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.82 vs. limit=15.0 2023-09-30 12:10:17,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:10:18,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:23,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. 
Duration: 0.71 2023-09-30 12:10:24,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:10:24,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=707600.0, ans=0.1 2023-09-30 12:10:25,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:29,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:30,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:10:30,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:31,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 12:10:33,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:10:33,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:36,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 12:10:38,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:10:38,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 12:10:40,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 12:10:40,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 12:10:41,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=707666.6666666666, ans=0.0 2023-09-30 12:10:44,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 12:10:46,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:46,090 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 12:10:46,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:47,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:47,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:10:48,963 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.872e+02 2.079e+02 2.395e+02 3.722e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-30 12:10:49,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 12:10:50,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 12:10:52,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:54,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 12:10:55,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 12:10:55,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 12:10:56,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=707666.6666666666, ans=0.0 2023-09-30 12:11:01,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 12:11:01,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:11:01,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=707733.3333333334, ans=0.125 2023-09-30 12:11:06,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:11:06,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:07,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 12:11:08,457 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.22 vs. 
limit=6.0 2023-09-30 12:11:09,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:11:09,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:11:09,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:10,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:11,476 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=707733.3333333334, ans=22.5 2023-09-30 12:11:12,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=707733.3333333334, ans=0.09899494936611666 2023-09-30 12:11:13,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:13,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:11:18,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:11:19,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:19,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:11:25,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:25,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 12:11:26,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:26,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:11:30,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:30,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:11:32,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:11:33,428 INFO [train.py:1039] (0/4) Epoch 20, batch 5250, loss[loss=0.1735, simple_loss=0.2211, pruned_loss=0.0629, over 19383.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.253, pruned_loss=0.05015, over 4720079.44 frames. ], batch size: 388, lr: 5.08e-03, grad_scale: 16.0 2023-09-30 12:11:34,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:11:38,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:38,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:11:39,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:11:47,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:49,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:11:50,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:11:53,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:54,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 12:11:54,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:56,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:12:10,292 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:12:47,655 INFO [train.py:1039] (0/4) Epoch 20, batch 5300, loss[loss=0.1534, simple_loss=0.214, pruned_loss=0.04635, over 23532.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2517, pruned_loss=0.05045, over 4707884.69 frames. 
], batch size: 256, lr: 5.08e-03, grad_scale: 8.0 2023-09-30 12:13:02,581 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-20.pt 2023-09-30 12:13:05,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:13:05,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 12:13:05,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 12:13:05,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:06,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:06,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:06,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:06,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:06,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:06,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:06,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. 
Duration: 0.57275 2023-09-30 12:13:06,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:13:07,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 12:13:07,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 12:13:07,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 12:13:07,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:13:07,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 12:13:07,559 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 12:13:07,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:08,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:08,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:08,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:08,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:13:09,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:09,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:09,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:09,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:09,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:09,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:13:09,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:09,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:13:10,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 12:13:10,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:11,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:11,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. 
Duration: 0.86 2023-09-30 12:13:11,237 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 12:13:11,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:13:11,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:11,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 12:13:11,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 12:13:11,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:12,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:13:13,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:13,305 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 12:13:13,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 12:13:13,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:13:13,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:13:13,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 12:13:13,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 12:13:13,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 12:13:14,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:17,132 INFO [train.py:1039] (0/4) Epoch 21, batch 0, loss[loss=0.1581, simple_loss=0.2311, pruned_loss=0.04259, over 24307.00 frames. ], tot_loss[loss=0.1581, simple_loss=0.2311, pruned_loss=0.04259, over 24307.00 frames. ], batch size: 56, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:13:17,133 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 12:13:30,282 INFO [train.py:1071] (0/4) Epoch 21, validation: loss=0.2775, simple_loss=0.2715, pruned_loss=0.1418, over 1125622.00 frames. 2023-09-30 12:13:30,283 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20954MB 2023-09-30 12:13:34,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 12:13:34,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:13:37,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:13:39,714 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.71 vs. 
limit=15.0 2023-09-30 12:13:40,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:41,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:13:42,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:42,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 12:13:43,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 12:13:47,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:47,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:49,467 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=708346.6666666666, ans=0.125 2023-09-30 12:13:50,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:50,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 12:13:52,057 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.011e+02 2.315e+02 3.678e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:13:52,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:13:53,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 12:13:56,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:14:05,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:14:05,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:07,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 12:14:12,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:14:12,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:14:15,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:19,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:14:21,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=708480.0, ans=0.1 2023-09-30 12:14:22,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:27,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 12:14:29,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=708480.0, ans=0.125 2023-09-30 12:14:30,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 12:14:30,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:30,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:32,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:14:32,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:33,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 12:14:37,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:37,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 12:14:42,720 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:14:43,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=708546.6666666666, ans=0.04949747468305833 2023-09-30 12:14:44,727 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=708546.6666666666, ans=0.09899494936611666 2023-09-30 12:14:45,900 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 12:14:47,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:14:51,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:14:52,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:53,945 INFO [train.py:1039] (0/4) Epoch 21, batch 50, loss[loss=0.1849, simple_loss=0.2664, pruned_loss=0.05168, over 24064.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2523, pruned_loss=0.04971, over 1063284.10 frames. ], batch size: 80, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:14:54,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 12:14:54,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:14:54,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:14:57,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:14:57,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:00,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:15:00,592 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=708613.3333333334, ans=0.0 2023-09-30 12:15:04,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 12:15:04,839 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:11,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:15:13,489 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 12:15:15,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 12:15:17,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:15:18,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:18,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:15:19,127 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=708680.0, ans=0.0 2023-09-30 12:15:20,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:15:20,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:15:21,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:15:21,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:27,640 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=708746.6666666666, ans=0.125 2023-09-30 12:15:33,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:34,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:15:34,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:15:35,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=708746.6666666666, ans=0.125 2023-09-30 12:15:36,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. 
Duration: 0.96 2023-09-30 12:15:37,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:15:39,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:15:39,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 12:15:40,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:15:42,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 12:15:44,422 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=708813.3333333334, ans=0.125 2023-09-30 12:15:49,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:15:49,451 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:50,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:51,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:51,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:15:56,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. 
Duration: 0.71 2023-09-30 12:15:56,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 12:15:59,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:59,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:16:00,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:16:02,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:16:02,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 12:16:04,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 12:16:04,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:16:05,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:07,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:16:08,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 12:16:08,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. 
Duration: 0.74 2023-09-30 12:16:10,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:11,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:13,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:16:13,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:16:14,848 INFO [train.py:1039] (0/4) Epoch 21, batch 100, loss[loss=0.1609, simple_loss=0.2332, pruned_loss=0.04433, over 23550.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2538, pruned_loss=0.05082, over 1876356.39 frames. ], batch size: 134, lr: 4.96e-03, grad_scale: 8.0 2023-09-30 12:16:16,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:16:18,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:16:22,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:24,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 12:16:24,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:16:30,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 12:16:30,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:30,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:30,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:16:30,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:33,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 12:16:35,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:16:36,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:36,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:36,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:38,921 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.766e+02 1.906e+02 2.268e+02 3.553e+02, threshold=3.812e+02, percent-clipped=0.0 2023-09-30 12:16:40,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 12:16:42,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:16:42,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:42,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:16:45,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:16:48,674 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 12:16:48,699 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 12:16:50,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:16:50,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:16:54,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:16:56,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:58,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:03,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:05,283 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. 
Duration: 0.64 2023-09-30 12:17:06,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:17:09,040 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.38 vs. limit=10.0 2023-09-30 12:17:09,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:12,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:13,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:16,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:18,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:19,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:17:21,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:22,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:24,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:17:24,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:17:24,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:24,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 12:17:24,362 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 12:17:24,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:25,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:17:26,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:26,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:26,062 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 12:17:26,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:17:27,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:17:27,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:17:29,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:30,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:32,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:17:32,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:17:33,724 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=709213.3333333334, ans=0.5 2023-09-30 12:17:34,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:37,865 INFO [train.py:1039] (0/4) Epoch 21, batch 150, loss[loss=0.1773, simple_loss=0.2508, pruned_loss=0.0519, over 23656.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2537, pruned_loss=0.05045, over 2506688.83 frames. ], batch size: 256, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:17:37,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:37,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:17:38,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:41,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:17:41,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:43,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:44,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:49,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 12:17:49,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 12:17:50,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 12:17:54,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:17:54,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:17:54,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:55,946 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:55,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:56,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:17:57,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:00,470 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 12:18:02,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:09,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:12,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:18:14,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 12:18:17,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:18:17,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:17,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:20,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:18:21,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:18:22,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 12:18:25,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:25,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 12:18:31,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:31,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:18:33,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:18:33,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:18:34,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:36,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 12:18:37,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:18:40,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:18:42,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:44,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 12:18:44,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 12:18:44,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=709546.6666666666, ans=0.0 2023-09-30 12:18:45,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:45,829 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 12:18:50,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:50,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=709546.6666666666, ans=0.125 2023-09-30 12:18:53,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:53,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:18:56,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 12:18:56,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:57,464 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.34 vs. limit=10.0 2023-09-30 12:18:58,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 12:19:00,190 INFO [train.py:1039] (0/4) Epoch 21, batch 200, loss[loss=0.1861, simple_loss=0.2664, pruned_loss=0.05289, over 23969.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2547, pruned_loss=0.05141, over 2996103.81 frames. ], batch size: 80, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:19:00,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 12:19:00,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:19:02,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:03,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:09,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:19:09,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:19:09,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:11,152 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=709613.3333333334, ans=0.0 2023-09-30 12:19:22,965 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.816e+02 2.113e+02 2.431e+02 3.187e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-30 12:19:30,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 12:19:30,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:19:33,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=709746.6666666666, ans=0.05 2023-09-30 12:19:34,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:19:34,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:19:35,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:19:35,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:19:37,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:39,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:19:39,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:19:39,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:19:40,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. 
Duration: 0.92 2023-09-30 12:19:40,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:19:42,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:42,595 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=709746.6666666666, ans=0.125 2023-09-30 12:19:45,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:19:53,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:20:02,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:02,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:20:03,915 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.65 vs. limit=10.0 2023-09-30 12:20:10,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:13,103 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:20:14,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 12:20:14,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 12:20:15,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:20:15,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:17,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:20:20,273 INFO [train.py:1039] (0/4) Epoch 21, batch 250, loss[loss=0.1772, simple_loss=0.2601, pruned_loss=0.04709, over 23953.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2531, pruned_loss=0.05076, over 3376249.48 frames. ], batch size: 86, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:20:20,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 12:20:20,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:20:20,489 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 12:20:21,180 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=4.94 vs. limit=15.0 2023-09-30 12:20:23,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:23,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:20:25,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:20:26,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:30,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:20:30,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:32,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:20:36,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:20:38,919 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=710013.3333333334, ans=0.025 2023-09-30 12:20:46,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:20:48,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:49,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:20:50,450 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:20:56,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:20:57,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. 
Duration: 0.6 2023-09-30 12:20:57,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:20:57,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:20:59,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:20:59,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:21:00,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:21:02,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:21:05,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 12:21:06,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:21:08,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:21:09,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:21:09,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:21:09,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 12:21:12,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:21:12,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:21:14,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:15,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:21:15,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=710146.6666666666, ans=0.0 2023-09-30 12:21:17,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:21,582 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:21:22,648 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.77 vs. limit=15.0 2023-09-30 12:21:26,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:29,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:21:36,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:21:37,320 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:21:38,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:21:40,739 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 12:21:42,822 INFO [train.py:1039] (0/4) Epoch 21, batch 300, loss[loss=0.1948, simple_loss=0.2674, pruned_loss=0.06115, over 23306.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2508, pruned_loss=0.05016, over 3665272.49 frames. ], batch size: 105, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:21:43,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:21:43,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:44,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 12:21:44,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:21:46,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:21:46,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 12:21:52,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:21:53,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:21:56,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:21:58,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 12:21:58,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:22:00,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:22:00,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 12:22:00,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:01,484 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.98 vs. limit=15.0 2023-09-30 12:22:05,041 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.836e+02 2.048e+02 2.213e+02 3.686e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:22:05,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:22:08,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 12:22:08,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 12:22:08,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=710346.6666666666, ans=0.2 2023-09-30 12:22:14,231 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 12:22:14,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:17,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:18,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:18,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 12:22:18,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:22:21,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:22:24,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:22:24,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:29,525 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. 
Duration: 0.52725 2023-09-30 12:22:29,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 12:22:31,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:22:32,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:34,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 12:22:34,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:38,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:22:41,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:22:41,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 12:22:46,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:46,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:22:47,501 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=10.57 vs. limit=12.0 2023-09-30 12:22:50,547 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:22:52,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:22:53,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 12:22:53,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:22:54,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:56,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 12:22:56,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:56,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:22:58,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:59,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:59,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:22:59,854 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=710546.6666666666, ans=0.125 2023-09-30 12:23:04,735 INFO [train.py:1039] (0/4) Epoch 21, batch 350, loss[loss=0.1438, simple_loss=0.1932, pruned_loss=0.04726, over 19137.00 frames. 
], tot_loss[loss=0.1747, simple_loss=0.2499, pruned_loss=0.04975, over 3887777.98 frames. ], batch size: 388, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:23:04,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:04,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:23:09,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:14,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:23:17,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:17,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:21,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 12:23:22,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:23,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 12:23:26,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:26,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. 
Duration: 0.59 2023-09-30 12:23:26,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:31,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 12:23:32,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:23:32,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:34,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:23:37,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:37,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:38,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:23:38,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:38,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:23:41,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:23:41,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:23:47,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:23:47,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:23:48,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:23:50,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:56,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 12:23:56,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:24:01,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:01,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:01,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:24:03,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 12:24:06,448 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=710813.3333333334, ans=0.0 2023-09-30 12:24:07,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:24:08,913 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 12:24:10,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 12:24:10,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:15,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:24:15,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 12:24:17,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:18,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:24:22,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:22,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:22,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:25,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:26,558 INFO [train.py:1039] (0/4) Epoch 21, batch 400, loss[loss=0.176, simple_loss=0.2435, pruned_loss=0.0543, over 23714.00 frames. 
], tot_loss[loss=0.1748, simple_loss=0.2503, pruned_loss=0.04962, over 4083266.56 frames. ], batch size: 232, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:24:26,980 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=710946.6666666666, ans=0.0 2023-09-30 12:24:28,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:24:30,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:24:30,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 12:24:32,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:32,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:34,929 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710946.6666666666, ans=0.1 2023-09-30 12:24:36,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:24:36,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:39,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:24:39,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=710946.6666666666, ans=0.0 2023-09-30 12:24:40,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:42,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 12:24:43,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 12:24:43,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:44,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 12:24:44,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:49,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:24:49,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:49,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 12:24:49,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:24:49,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:24:50,712 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.787e+02 1.990e+02 2.388e+02 3.650e+02, threshold=3.979e+02, percent-clipped=0.0 2023-09-30 12:24:50,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:50,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:52,514 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 12:24:52,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 12:24:57,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:58,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=711013.3333333334, ans=0.025 2023-09-30 12:24:59,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:25:00,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 12:25:00,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 12:25:01,547 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.25 vs. limit=22.5 2023-09-30 12:25:04,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 12:25:09,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:15,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 12:25:18,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:25:21,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 12:25:22,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:25:25,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:25:25,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 12:25:30,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:25:31,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:25:33,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:25:37,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:37,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. 
Duration: 0.92 2023-09-30 12:25:41,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:25:41,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 12:25:43,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=711213.3333333334, ans=0.0 2023-09-30 12:25:44,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:25:44,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:25:46,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 12:25:47,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:25:49,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:25:49,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:25:50,692 INFO [train.py:1039] (0/4) Epoch 21, batch 450, loss[loss=0.1762, simple_loss=0.2574, pruned_loss=0.04756, over 24703.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2511, pruned_loss=0.04987, over 4233698.14 frames. ], batch size: 68, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:25:50,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. 
Duration: 0.86 2023-09-30 12:25:50,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:25:51,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:25:52,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:25:52,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 12:25:54,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:25:56,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:25:57,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:26:08,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:10,439 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:12,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 12:26:12,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. 
Duration: 0.99 2023-09-30 12:26:12,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=711346.6666666666, ans=0.025 2023-09-30 12:26:14,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=711346.6666666666, ans=0.0 2023-09-30 12:26:15,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:26:17,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:19,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:22,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:24,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:28,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 12:26:28,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 12:26:29,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 12:26:29,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:26:31,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:26:32,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:26:34,460 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 12:26:34,475 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 12:26:34,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:36,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:26:37,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:26:42,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:26:44,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:26:44,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:26:44,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 12:26:49,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:51,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 12:26:51,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:26:54,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 12:26:56,881 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=711546.6666666666, ans=0.125 2023-09-30 12:26:58,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:26:58,842 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.00 vs. limit=10.0 2023-09-30 12:26:59,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 12:27:00,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 12:27:01,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:27:07,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:27:09,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:10,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:27:10,906 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. 
Duration: 0.896 2023-09-30 12:27:13,759 INFO [train.py:1039] (0/4) Epoch 21, batch 500, loss[loss=0.179, simple_loss=0.2497, pruned_loss=0.05414, over 23771.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2513, pruned_loss=0.04943, over 4357448.26 frames. ], batch size: 212, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:27:15,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:17,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:27:17,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:17,283 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 12:27:19,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 12:27:19,354 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:19,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=711613.3333333334, ans=0.1 2023-09-30 12:27:21,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:27:25,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:27:28,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 12:27:31,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:31,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:32,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:37,394 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.852e+02 2.048e+02 2.259e+02 3.327e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:27:41,581 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=711680.0, ans=0.2 2023-09-30 12:27:42,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:42,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:27:42,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:27:42,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:44,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 12:27:44,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:27:46,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 12:27:46,341 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=711746.6666666666, ans=0.1 2023-09-30 12:27:47,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:27:48,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:27:48,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:49,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 12:27:52,584 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 12:27:56,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:27:56,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:57,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:28:02,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. 
Duration: 0.62 2023-09-30 12:28:05,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:28:06,218 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=711813.3333333334, ans=0.1 2023-09-30 12:28:07,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:10,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:14,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:28:19,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:20,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 12:28:20,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:20,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:23,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 12:28:25,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:28:27,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:28:33,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 12:28:34,562 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.70 vs. limit=15.0 2023-09-30 12:28:35,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 12:28:35,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:35,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 12:28:35,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:28:35,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:37,325 INFO [train.py:1039] (0/4) Epoch 21, batch 550, loss[loss=0.1674, simple_loss=0.2557, pruned_loss=0.03956, over 24617.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2524, pruned_loss=0.05048, over 4409763.29 frames. ], batch size: 68, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:28:37,425 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 12:28:39,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:28:39,250 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=711946.6666666666, ans=0.2 2023-09-30 12:28:42,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:43,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 12:28:43,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:28:47,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=711946.6666666666, ans=0.125 2023-09-30 12:28:49,191 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:28:49,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:52,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:28:53,111 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.22 vs. limit=15.0 2023-09-30 12:28:55,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 12:28:55,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=712013.3333333334, ans=0.125 2023-09-30 12:28:59,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 12:28:59,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 12:29:02,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:29:06,871 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=712013.3333333334, ans=0.07 2023-09-30 12:29:08,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:29:08,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:11,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:29:15,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:15,091 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 12:29:15,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:29:16,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. 
Duration: 0.5555625 2023-09-30 12:29:19,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:19,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:29:21,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:29:21,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:23,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 12:29:26,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 12:29:27,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:27,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:29:29,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:29:29,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:29:33,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:29:35,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 12:29:37,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:29:37,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:39,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 12:29:41,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:29:42,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:44,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:29:45,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:46,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=712213.3333333334, ans=0.2 2023-09-30 12:29:47,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:29:47,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:29:52,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 12:29:54,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. 
Duration: 0.46 2023-09-30 12:29:58,033 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:29:58,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:29:58,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:58,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=712280.0, ans=0.125 2023-09-30 12:29:59,458 INFO [train.py:1039] (0/4) Epoch 21, batch 600, loss[loss=0.1753, simple_loss=0.2603, pruned_loss=0.04511, over 24439.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2527, pruned_loss=0.05026, over 4489838.32 frames. ], batch size: 69, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:30:05,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:30:07,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:30:09,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 12:30:11,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:30:12,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:14,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:30:17,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 12:30:17,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:30:21,802 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.803e+02 1.996e+02 2.235e+02 3.480e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 12:30:24,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 12:30:26,198 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=712346.6666666666, ans=0.1 2023-09-30 12:30:27,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:30:27,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:27,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:30:34,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:30:34,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:30:34,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:30:40,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:30:41,096 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.95 vs. limit=15.0 2023-09-30 12:30:44,454 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:44,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:44,473 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:46,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=712413.3333333334, ans=15.0 2023-09-30 12:30:53,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 12:31:00,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:31:00,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:04,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 12:31:05,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 12:31:08,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 12:31:08,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:31:08,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:31:14,078 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.27 vs. limit=15.0 2023-09-30 12:31:15,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:31:15,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:31:17,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:31:19,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:31:21,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:22,642 INFO [train.py:1039] (0/4) Epoch 21, batch 650, loss[loss=0.1765, simple_loss=0.2561, pruned_loss=0.04845, over 24501.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2516, pruned_loss=0.05026, over 4514347.33 frames. ], batch size: 66, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:31:24,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. 
Duration: 0.72 2023-09-30 12:31:24,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=712613.3333333334, ans=0.1 2023-09-30 12:31:25,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:31:26,405 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.88 vs. limit=15.0 2023-09-30 12:31:31,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:31:31,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:36,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:40,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 12:31:43,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:31:43,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:48,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:50,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. 
Duration: 0.5333125 2023-09-30 12:31:54,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:54,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:56,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:31:57,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:58,553 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-30 12:31:59,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:32:01,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:32:01,075 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 12:32:01,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:01,135 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:01,722 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.29 vs. limit=15.0 2023-09-30 12:32:04,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:32:05,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:07,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:07,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:32:09,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 12:32:09,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:32:09,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:32:11,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:32:11,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:14,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:32:14,236 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 12:32:15,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 12:32:15,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:32:15,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:15,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:32:15,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:32:18,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:32:27,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:27,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:32:28,954 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:31,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:31,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:32:33,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:41,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 12:32:41,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:42,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:32:43,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:45,052 INFO [train.py:1039] (0/4) Epoch 21, batch 700, loss[loss=0.1608, simple_loss=0.2431, pruned_loss=0.03928, over 24599.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2497, pruned_loss=0.04931, over 4555429.84 frames. ], batch size: 68, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:32:45,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=712946.6666666666, ans=0.1 2023-09-30 12:32:46,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 12:32:48,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 12:32:51,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 12:32:51,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:32:51,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=712946.6666666666, ans=0.1 2023-09-30 12:32:51,753 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=712946.6666666666, ans=0.125 2023-09-30 12:32:52,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:32:54,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 12:32:59,284 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:02,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:33:04,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:06,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:33:06,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=713013.3333333334, ans=0.0 2023-09-30 12:33:07,181 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.845e+02 2.008e+02 2.196e+02 3.321e+02, threshold=4.016e+02, percent-clipped=0.0 2023-09-30 12:33:07,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:33:10,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:12,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:33:12,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:33:13,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 12:33:17,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 12:33:21,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:33:21,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:33:23,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:33:26,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:33:26,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 12:33:31,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:31,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 12:33:34,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 12:33:35,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:37,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:38,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:33:43,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:33:43,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 12:33:49,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 12:33:49,408 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 12:33:52,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:54,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:33:55,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:33:58,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:33:58,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 12:34:03,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 12:34:03,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 12:34:03,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 12:34:05,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 12:34:06,589 INFO [train.py:1039] (0/4) Epoch 21, batch 750, loss[loss=0.1741, simple_loss=0.2428, pruned_loss=0.05268, over 23831.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2502, pruned_loss=0.04971, over 4587698.93 frames. ], batch size: 212, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:34:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 12:34:06,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:34:08,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 12:34:10,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:34:10,832 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=713280.0, ans=0.1 2023-09-30 12:34:11,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:12,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:14,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:16,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:34:16,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:19,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:34:19,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:34:21,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:34:26,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:26,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:34:26,517 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=713346.6666666666, ans=0.125 2023-09-30 12:34:27,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 12:34:29,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:34:29,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:30,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:32,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=713346.6666666666, ans=0.0 2023-09-30 12:34:33,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:34:35,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 12:34:35,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:34:39,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 12:34:39,197 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 12:34:40,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. 
Duration: 0.92 2023-09-30 12:34:40,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:34:40,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:34:43,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:34:51,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:51,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:34:51,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:34:53,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:53,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:53,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 12:34:55,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:34:56,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:34:57,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 12:35:02,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:35:02,299 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=713480.0, ans=0.125 2023-09-30 12:35:04,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 12:35:05,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:10,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:11,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:35:12,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:14,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:35:18,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 12:35:18,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:18,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:23,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 12:35:23,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:25,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:25,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:35:30,180 INFO [train.py:1039] (0/4) Epoch 21, batch 800, loss[loss=0.2005, simple_loss=0.2629, pruned_loss=0.06901, over 23796.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2505, pruned_loss=0.04947, over 4627267.23 frames. ], batch size: 212, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:35:36,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:36,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:40,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:40,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:40,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:42,292 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:43,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:35:49,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:49,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:35:52,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 12:35:52,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:52,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=713680.0, ans=0.0 2023-09-30 12:35:52,897 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.36 vs. limit=22.5 2023-09-30 12:35:53,664 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.844e+02 2.013e+02 2.212e+02 3.409e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 12:35:53,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:53,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:35:53,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:35:55,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 12:35:55,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 12:35:55,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 12:36:00,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:02,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:05,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:36:05,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:36:06,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:06,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:13,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:36:13,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:36:13,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 12:36:15,715 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 12:36:15,760 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. 
Duration: 0.57 2023-09-30 12:36:15,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:36:15,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:18,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:18,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:36:24,037 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 12:36:25,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 12:36:26,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:36:28,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:36:33,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:36:36,835 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:38,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 12:36:38,422 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 12:36:40,171 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=713880.0, ans=0.125 2023-09-30 12:36:42,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 12:36:49,940 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=713880.0, ans=0.125 2023-09-30 12:36:51,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:36:53,262 INFO [train.py:1039] (0/4) Epoch 21, batch 850, loss[loss=0.1841, simple_loss=0.2585, pruned_loss=0.05481, over 23454.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.04981, over 4651703.23 frames. ], batch size: 106, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:36:53,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:36:53,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 12:36:54,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:36:54,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:56,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 12:36:56,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 12:36:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:58,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:00,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:37:01,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:37:04,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 12:37:04,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 12:37:04,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 12:37:06,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:37:06,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:06,900 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=713946.6666666666, ans=0.1 2023-09-30 12:37:08,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:09,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:37:09,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:37:15,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:15,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:15,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 12:37:16,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=714013.3333333334, ans=0.125 2023-09-30 12:37:19,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 12:37:20,508 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.46 vs. limit=10.0 2023-09-30 12:37:23,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:25,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 12:37:28,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 12:37:29,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 12:37:33,309 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. 
Duration: 0.896 2023-09-30 12:37:33,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:33,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:37:33,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:37:36,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 12:37:40,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:42,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:42,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:37:44,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:37:44,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:37:46,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. 
Duration: 0.57275 2023-09-30 12:37:46,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 12:37:51,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:37:51,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:37:51,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=714146.6666666666, ans=0.125 2023-09-30 12:37:52,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:37:52,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:52,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:55,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:56,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:37:58,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:37:59,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:38:00,192 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=714213.3333333334, ans=0.125 2023-09-30 12:38:01,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:38:05,939 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.81 vs. limit=15.0 2023-09-30 12:38:06,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=714213.3333333334, ans=0.125 2023-09-30 12:38:09,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:38:11,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:38:11,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 12:38:12,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:12,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:38:14,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 12:38:15,646 INFO [train.py:1039] (0/4) Epoch 21, batch 900, loss[loss=0.1887, simple_loss=0.2543, pruned_loss=0.06158, over 23730.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2533, pruned_loss=0.05107, over 4647444.69 frames. 
], batch size: 232, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:38:19,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:38:20,213 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:38:21,461 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=714280.0, ans=0.125 2023-09-30 12:38:22,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:24,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 12:38:29,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:38:29,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 12:38:31,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:38:32,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:32,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:32,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:38:32,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 12:38:33,255 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=714346.6666666666, ans=0.125 2023-09-30 12:38:36,788 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.58 vs. limit=22.5 2023-09-30 12:38:39,495 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.031e+02 2.211e+02 2.952e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 12:38:42,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:38:42,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:42,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:38:44,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:49,745 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.09 vs. limit=10.0 2023-09-30 12:38:51,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 12:38:53,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:38:59,002 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=714413.3333333334, ans=0.0 2023-09-30 12:39:00,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:39:00,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:39:01,942 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 12:39:02,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 12:39:05,048 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:39:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:39:09,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:39:10,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:39:19,055 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:19,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 12:39:20,843 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=714546.6666666666, ans=0.125 2023-09-30 12:39:21,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 12:39:21,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:39:25,112 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 12:39:26,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:39:26,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:28,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:39:29,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:39:35,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 12:39:35,190 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 12:39:36,786 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:39:36,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. 
Duration: 0.64 2023-09-30 12:39:38,195 INFO [train.py:1039] (0/4) Epoch 21, batch 950, loss[loss=0.1543, simple_loss=0.2296, pruned_loss=0.03947, over 24308.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2536, pruned_loss=0.05109, over 4663570.73 frames. ], batch size: 56, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:39:39,844 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:45,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 12:39:48,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:39:52,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:52,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:54,347 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:39:57,183 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 12:40:01,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:01,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:02,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 12:40:02,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:40:02,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=714680.0, ans=0.2 2023-09-30 12:40:03,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 12:40:03,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:40:05,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:07,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 12:40:07,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:12,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:13,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:13,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:40:15,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 12:40:17,591 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-30 12:40:19,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:21,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:40:23,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=714746.6666666666, ans=0.1 2023-09-30 12:40:26,534 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:40:26,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:29,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 12:40:31,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:40:31,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:40:33,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=714813.3333333334, ans=0.1 2023-09-30 12:40:34,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:34,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:34,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 12:40:37,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 12:40:38,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:40:40,635 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:42,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:42,740 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 12:40:42,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:42,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:40:42,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 12:40:47,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:40:51,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:53,666 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:40:54,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:40:56,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 12:40:56,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 12:40:59,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:41:02,682 INFO [train.py:1039] (0/4) Epoch 21, batch 1000, loss[loss=0.1785, simple_loss=0.2593, pruned_loss=0.04889, over 23364.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2521, pruned_loss=0.05066, over 4669239.44 frames. ], batch size: 105, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:41:02,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 12:41:02,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:04,772 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=714946.6666666666, ans=0.125 2023-09-30 12:41:08,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:41:10,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 12:41:10,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 12:41:15,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:41:15,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:41:18,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:21,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 12:41:25,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 12:41:26,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=715013.3333333334, ans=15.0 2023-09-30 12:41:27,770 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.831e+02 2.095e+02 2.362e+02 3.753e+02, threshold=4.190e+02, percent-clipped=0.0 2023-09-30 12:41:27,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 12:41:29,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:30,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 12:41:33,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 12:41:33,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 12:41:34,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:41:35,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:45,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:45,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:41:46,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:48,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:48,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 12:41:48,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:50,083 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:41:51,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:51,665 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 12:41:54,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 12:41:55,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.84 vs. 
limit=10.0 2023-09-30 12:41:56,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 12:41:59,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 12:42:01,162 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.10 vs. limit=15.0 2023-09-30 12:42:02,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:42:08,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:08,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:42:10,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:10,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:42:12,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 12:42:13,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:42:13,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 12:42:13,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. 
Duration: 0.66 2023-09-30 12:42:15,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:15,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:42:18,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:42:21,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:42:23,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:42:24,982 INFO [train.py:1039] (0/4) Epoch 21, batch 1050, loss[loss=0.1626, simple_loss=0.2506, pruned_loss=0.03732, over 24323.00 frames. ], tot_loss[loss=0.175, simple_loss=0.25, pruned_loss=0.04996, over 4672049.39 frames. ], batch size: 74, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:42:25,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:42:25,841 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.32 vs. limit=12.0 2023-09-30 12:42:26,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:42:28,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:42:29,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 12:42:32,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:42:35,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:42:36,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:42:39,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:42:40,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:42:40,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:42:41,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:42:43,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 12:42:43,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:43,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 12:42:44,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:44,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-30 12:42:46,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:42:52,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:52,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:42:52,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 12:42:57,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 12:42:59,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:43:00,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 12:43:04,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 12:43:06,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:10,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:43:11,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. 
Duration: 0.67775 2023-09-30 12:43:13,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:43:13,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:43:13,414 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=715480.0, ans=0.125 2023-09-30 12:43:16,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:43:19,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 12:43:21,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 12:43:21,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 12:43:22,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:22,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:43:24,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 12:43:29,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:43:32,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 12:43:32,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:43:32,406 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:32,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:34,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=715546.6666666666, ans=0.0 2023-09-30 12:43:37,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 12:43:40,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:40,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 12:43:41,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 12:43:42,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:43:45,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:43:47,960 INFO [train.py:1039] (0/4) Epoch 21, batch 1100, loss[loss=0.1782, simple_loss=0.2637, pruned_loss=0.04639, over 23976.00 frames. 
], tot_loss[loss=0.175, simple_loss=0.2502, pruned_loss=0.04988, over 4681290.36 frames. ], batch size: 80, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:43:51,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:43:57,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:43:58,158 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.44 vs. limit=22.5 2023-09-30 12:43:58,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:43:58,759 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:00,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 12:44:01,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:44:05,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:44:07,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:44:10,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:44:10,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. 
Duration: 0.75 2023-09-30 12:44:11,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:44:13,692 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.860e+02 2.075e+02 2.430e+02 4.755e+02, threshold=4.150e+02, percent-clipped=2.0 2023-09-30 12:44:13,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:13,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:44:17,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:44:19,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:44:21,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=715746.6666666666, ans=0.1 2023-09-30 12:44:24,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:44:28,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 12:44:28,946 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 12:44:30,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 12:44:33,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:44:35,283 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:44:38,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 12:44:38,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:44:38,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:44:38,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:44:38,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:40,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 12:44:46,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:44:46,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 12:44:50,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 12:44:50,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=715813.3333333334, ans=0.0 2023-09-30 12:44:50,962 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:44:53,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:44:56,605 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 12:44:56,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:44:58,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:00,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:00,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:03,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 12:45:04,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:45:04,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:06,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. 
Duration: 0.86 2023-09-30 12:45:06,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:45:07,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 12:45:07,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:45:08,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:45:09,407 INFO [train.py:1039] (0/4) Epoch 21, batch 1150, loss[loss=0.1814, simple_loss=0.2693, pruned_loss=0.04674, over 24302.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2509, pruned_loss=0.0497, over 4695268.20 frames. ], batch size: 74, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:45:09,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:45:16,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:19,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:45:21,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:21,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:45:21,333 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. 
Duration: 0.95 2023-09-30 12:45:22,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:25,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 12:45:26,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:26,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:45:26,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=716013.3333333334, ans=0.125 2023-09-30 12:45:31,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 12:45:33,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:36,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=716013.3333333334, ans=0.0 2023-09-30 12:45:38,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:38,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:45:38,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 12:45:38,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 12:45:38,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:44,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 12:45:44,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=716080.0, ans=0.125 2023-09-30 12:45:45,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:46,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=716080.0, ans=0.0 2023-09-30 12:45:48,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:53,332 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716080.0, ans=0.1 2023-09-30 12:45:56,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:46:01,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:46:01,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 12:46:02,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:03,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 12:46:10,486 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.90 vs. limit=12.0 2023-09-30 12:46:11,466 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:46:12,681 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 12:46:14,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:16,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=716213.3333333334, ans=0.0 2023-09-30 12:46:22,554 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 12:46:25,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:27,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:46:27,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:46:27,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:46:30,680 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=716280.0, ans=0.0 2023-09-30 12:46:32,261 INFO [train.py:1039] (0/4) Epoch 21, batch 1200, loss[loss=0.182, simple_loss=0.2546, pruned_loss=0.05469, over 23745.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2521, pruned_loss=0.05006, over 4700732.04 frames. 
], batch size: 232, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:46:32,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:37,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:46:37,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:46:40,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:46:40,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:40,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:46:42,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:46:44,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:46:47,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:47,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:50,443 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 12:46:52,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. 
Duration: 0.51 2023-09-30 12:46:53,877 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=716346.6666666666, ans=0.09899494936611666 2023-09-30 12:46:57,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:46:58,563 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.836e+02 2.061e+02 2.415e+02 4.765e+02, threshold=4.121e+02, percent-clipped=1.0 2023-09-30 12:47:00,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:47:01,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:03,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:03,339 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 12:47:04,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:12,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:47:12,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:47:12,288 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=716413.3333333334, ans=0.05 2023-09-30 12:47:13,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 12:47:13,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=716413.3333333334, ans=10.0 2023-09-30 12:47:14,177 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.58 vs. limit=22.5 2023-09-30 12:47:14,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:47:19,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 12:47:23,671 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-09-30 12:47:24,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 12:47:24,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:25,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:47:27,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:47:28,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:47:30,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:30,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:47:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:47:32,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 12:47:34,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:47:34,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:34,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:47:37,430 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:47:37,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:41,294 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:47:42,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 12:47:46,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 12:47:49,779 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 12:47:52,608 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:53,989 INFO [train.py:1039] (0/4) Epoch 21, batch 1250, loss[loss=0.1814, simple_loss=0.2698, pruned_loss=0.04653, over 24568.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2531, pruned_loss=0.04991, over 4709473.92 frames. ], batch size: 71, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:47:54,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:57,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:47:59,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:48:03,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 12:48:05,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:48:07,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:07,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. 
Duration: 0.47 2023-09-30 12:48:10,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:48:12,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:48:15,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:48:17,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:17,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:48:17,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:21,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:48:24,398 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=716680.0, ans=0.125 2023-09-30 12:48:25,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:48:25,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:48:25,603 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:48:27,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:48:28,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:28,829 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=716746.6666666666, ans=0.1 2023-09-30 12:48:31,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:33,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:48:37,433 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.64 vs. limit=12.0 2023-09-30 12:48:38,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 12:48:39,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:48:42,231 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=716813.3333333334, ans=0.1 2023-09-30 12:48:43,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:48:44,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 12:48:45,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:45,066 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. 
Duration: 0.541 2023-09-30 12:48:45,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:45,103 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:50,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:48:53,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 12:48:53,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 12:48:55,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 12:48:58,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:48:59,769 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.45 vs. limit=15.0 2023-09-30 12:49:00,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 12:49:00,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:49:03,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:49:03,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:49:05,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 12:49:05,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:49:07,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:49:07,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:49:07,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:07,750 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=716880.0, ans=0.0 2023-09-30 12:49:10,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 12:49:12,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:14,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:49:14,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 12:49:17,285 INFO [train.py:1039] (0/4) Epoch 21, batch 1300, loss[loss=0.1579, simple_loss=0.237, pruned_loss=0.03941, over 24316.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2535, pruned_loss=0.05002, over 4717665.59 frames. ], batch size: 61, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:49:17,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:49:20,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:21,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 12:49:27,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:28,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:49:30,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:49:30,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=716946.6666666666, ans=0.1 2023-09-30 12:49:31,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:31,804 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:49:33,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. 
Duration: 0.81 2023-09-30 12:49:33,553 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=717013.3333333334, ans=0.125 2023-09-30 12:49:37,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:49:39,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:49:40,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 12:49:43,779 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.834e+02 2.043e+02 2.344e+02 3.785e+02, threshold=4.086e+02, percent-clipped=0.0 2023-09-30 12:49:45,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:49:50,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:50,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:51,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:53,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:54,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 12:49:56,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:49:56,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 12:49:56,801 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=717080.0, ans=0.125 2023-09-30 12:50:02,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:50:02,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:50:05,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 12:50:05,267 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:50:06,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:50:08,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:50:09,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 12:50:10,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:11,492 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. 
Duration: 0.94 2023-09-30 12:50:12,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:16,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:50:16,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:50:22,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 12:50:22,150 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 12:50:25,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 12:50:28,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:50:31,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 12:50:33,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:39,719 INFO [train.py:1039] (0/4) Epoch 21, batch 1350, loss[loss=0.1675, simple_loss=0.2386, pruned_loss=0.04822, over 23562.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04976, over 4714492.48 frames. ], batch size: 149, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:50:39,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. 
Duration: 0.91 2023-09-30 12:50:42,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:44,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:50:46,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:48,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:50,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:50:52,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:55,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:56,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 12:50:58,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:50:59,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:51:00,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=717346.6666666666, ans=0.125 2023-09-30 12:51:02,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. 
Duration: 0.39 2023-09-30 12:51:04,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:51:04,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:51:04,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 12:51:06,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 12:51:09,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 12:51:12,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:12,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 12:51:24,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:29,901 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.40 vs. limit=22.5 2023-09-30 12:51:33,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:35,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:35,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. 
Duration: 0.85 2023-09-30 12:51:38,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:41,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 12:51:41,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:51:41,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:51:45,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:51:47,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 12:51:48,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:51:54,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 12:51:54,353 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=717546.6666666666, ans=0.125 2023-09-30 12:51:55,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 12:52:01,868 INFO [train.py:1039] (0/4) Epoch 21, batch 1400, loss[loss=0.1524, simple_loss=0.2299, pruned_loss=0.03741, over 24298.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.251, pruned_loss=0.04886, over 4729383.37 frames. 
], batch size: 56, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:52:02,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 12:52:04,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:52:04,702 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.30 vs. limit=22.5 2023-09-30 12:52:07,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:52:08,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:52:12,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 12:52:13,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 12:52:24,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:52:28,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:30,105 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.890e+02 2.143e+02 2.435e+02 3.256e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-30 12:52:30,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 12:52:30,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:52:35,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:52:36,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:52:44,391 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=717746.6666666666, ans=0.0 2023-09-30 12:52:46,258 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.12 vs. limit=15.0 2023-09-30 12:52:48,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:48,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:55,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 12:52:55,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:52:55,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:52:57,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:52:57,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 12:52:58,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:52:58,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:52:58,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:53:01,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 12:53:01,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:53:04,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=717813.3333333334, ans=0.1 2023-09-30 12:53:06,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:10,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:53:17,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 12:53:18,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:53:20,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:53:21,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. 
Duration: 0.4333125 2023-09-30 12:53:23,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:25,612 INFO [train.py:1039] (0/4) Epoch 21, batch 1450, loss[loss=0.1555, simple_loss=0.2374, pruned_loss=0.03683, over 24661.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2506, pruned_loss=0.04853, over 4729196.84 frames. ], batch size: 65, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:53:25,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:53:28,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:53:31,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:53:31,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:32,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:53:35,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=717946.6666666666, ans=0.125 2023-09-30 12:53:38,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:39,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:53:41,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 12:53:41,356 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 12:53:42,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:53:45,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 12:53:46,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:46,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:46,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 12:53:48,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:53:49,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:53:49,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 12:53:49,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:51,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:53:53,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:53:56,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:56,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718080.0, ans=0.1 2023-09-30 12:54:00,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:54:00,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:54:03,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:54:03,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:04,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:04,916 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:54:04,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:06,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:11,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 12:54:12,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:54:17,837 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 12:54:19,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:20,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:54:22,367 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:22,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 12:54:27,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:27,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 12:54:31,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 12:54:31,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:34,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=718213.3333333334, ans=0.0 2023-09-30 12:54:35,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:54:35,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 12:54:38,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 12:54:41,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 12:54:41,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 12:54:42,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:44,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:54:47,193 INFO [train.py:1039] (0/4) Epoch 21, batch 1500, loss[loss=0.1567, simple_loss=0.2262, pruned_loss=0.04357, over 24426.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2511, pruned_loss=0.04873, over 4734234.22 frames. ], batch size: 58, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:54:52,977 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=718280.0, ans=0.125 2023-09-30 12:54:55,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 12:54:55,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:54:55,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:54:57,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 12:54:57,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:54:59,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:54:59,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 12:55:01,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:55:01,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:55:01,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:55:03,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:55:03,890 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=718346.6666666666, ans=0.125 2023-09-30 12:55:05,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:05,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=718346.6666666666, ans=0.025 2023-09-30 12:55:06,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:55:13,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:13,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 12:55:14,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:55:16,112 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.806e+02 1.957e+02 2.271e+02 3.905e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 12:55:16,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:55:17,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:20,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 12:55:24,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 12:55:26,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:55:26,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 12:55:29,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:55:30,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 12:55:32,301 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:32,324 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:55:32,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 12:55:33,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:55:33,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:36,088 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 12:55:36,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:41,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:55:41,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 12:55:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:55:48,941 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.39 vs. limit=15.0 2023-09-30 12:55:49,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 12:55:50,296 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=718480.0, ans=0.125 2023-09-30 12:55:52,963 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 12:55:53,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:55:53,040 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 12:55:54,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:54,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=718546.6666666666, ans=0.0 2023-09-30 12:55:56,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:55:58,040 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 12:55:58,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=718546.6666666666, ans=0.1 2023-09-30 12:55:59,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:56:01,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 12:56:02,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 12:56:06,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:07,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:08,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:09,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:09,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:56:09,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 12:56:11,101 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-09-30 12:56:11,618 INFO [train.py:1039] (0/4) Epoch 21, batch 1550, loss[loss=0.1424, simple_loss=0.223, pruned_loss=0.03089, over 24325.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2517, pruned_loss=0.04904, over 4723219.72 frames. ], batch size: 61, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:56:11,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 12:56:11,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:56:13,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. 
Duration: 0.84 2023-09-30 12:56:14,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 12:56:16,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:18,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:19,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:19,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:56:21,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:23,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:25,254 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 12:56:25,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:26,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:56:28,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:56:29,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. 
Duration: 0.788875 2023-09-30 12:56:29,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 12:56:30,551 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.40 vs. limit=22.5 2023-09-30 12:56:31,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:33,466 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 12:56:33,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 12:56:35,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 12:56:35,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:35,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=718680.0, ans=0.125 2023-09-30 12:56:35,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=718680.0, ans=0.125 2023-09-30 12:56:38,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:41,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:56:44,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. 
Duration: 0.68 2023-09-30 12:56:44,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 12:56:54,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:57,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:57,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:56:57,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:56:58,163 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=718746.6666666666, ans=0.125 2023-09-30 12:56:59,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 12:57:04,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:57:06,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:11,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:57:14,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:57:14,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 12:57:15,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 12:57:15,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:16,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:57:17,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:17,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 12:57:17,623 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 12:57:22,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:27,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 12:57:28,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=718880.0, ans=0.125 2023-09-30 12:57:28,155 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=718880.0, ans=0.1 2023-09-30 12:57:31,384 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=718880.0, ans=0.2 2023-09-30 12:57:32,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:57:34,722 INFO [train.py:1039] (0/4) Epoch 21, batch 1600, loss[loss=0.1768, simple_loss=0.2638, pruned_loss=0.04491, over 24417.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2525, pruned_loss=0.04968, over 4732694.36 frames. ], batch size: 69, lr: 4.92e-03, grad_scale: 16.0 2023-09-30 12:57:34,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:34,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 12:57:35,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=718946.6666666666, ans=0.125 2023-09-30 12:57:36,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:36,533 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=718946.6666666666, ans=0.125 2023-09-30 12:57:37,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:37,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:57:38,084 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=718946.6666666666, ans=0.025 2023-09-30 12:57:39,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:57:39,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 12:57:41,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:43,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 12:57:44,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 12:57:47,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 12:57:50,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:57:52,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 12:57:52,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:57:56,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:57:59,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:58:01,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 12:58:04,559 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.831e+02 2.011e+02 2.218e+02 3.597e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:58:04,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 12:58:06,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 12:58:06,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:07,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 12:58:11,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 12:58:20,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:21,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 12:58:21,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:23,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:58:23,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:58:24,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 12:58:28,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 12:58:30,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 12:58:30,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:31,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:33,782 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:58:34,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:58:36,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:58:37,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:58:43,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:43,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:58:47,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 12:58:47,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:58:48,805 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 12:58:53,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 12:58:56,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:58:56,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:58:56,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 12:58:56,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 12:58:56,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 12:58:56,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 12:58:57,951 INFO [train.py:1039] (0/4) Epoch 21, batch 1650, loss[loss=0.1809, simple_loss=0.2546, pruned_loss=0.05355, over 24015.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2534, pruned_loss=0.05027, over 4729195.17 frames. ], batch size: 80, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:59:01,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:59:01,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:01,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:03,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 12:59:06,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:08,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 12:59:13,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:59:13,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:13,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:59:13,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:59:15,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 12:59:15,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 12:59:17,017 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=719346.6666666666, ans=0.0 2023-09-30 12:59:21,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:59:24,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:59:32,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. 
Duration: 0.93 2023-09-30 12:59:34,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:35,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 12:59:40,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:59:42,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:59:44,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:59:46,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:59:47,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:59:47,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:48,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:49,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:49,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:50,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 12:59:52,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:52,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:59:56,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:57,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 13:00:00,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:00:00,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 13:00:02,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 13:00:02,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 13:00:02,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:03,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:00:03,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:05,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:00:05,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 13:00:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:12,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:00:12,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:15,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 13:00:20,671 INFO [train.py:1039] (0/4) Epoch 21, batch 1700, loss[loss=0.1836, simple_loss=0.2638, pruned_loss=0.05169, over 23413.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2531, pruned_loss=0.04998, over 4725231.66 frames. ], batch size: 106, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:00:20,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:20,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:00:20,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 13:00:20,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:21,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 13:00:21,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:24,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:00:24,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:00:25,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 13:00:27,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:00:37,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:40,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:00:45,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:00:45,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:00:45,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:47,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:00:49,899 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.892e+02 2.034e+02 2.348e+02 3.587e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 13:00:50,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 13:00:51,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:00:51,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:52,649 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.85 vs. limit=22.5 2023-09-30 13:00:55,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:00:55,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:00:58,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 13:00:59,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 13:01:00,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:02,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 13:01:03,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 13:01:03,943 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=719746.6666666666, ans=0.125 2023-09-30 13:01:14,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:15,472 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.24 vs. limit=22.5 2023-09-30 13:01:16,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:16,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:01:18,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:01:18,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 13:01:18,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:01:18,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=719813.3333333334, ans=0.125 2023-09-30 13:01:21,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:21,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. 
Duration: 0.91 2023-09-30 13:01:21,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:01:21,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:23,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:23,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:24,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:24,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:01:26,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:26,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:01:26,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:33,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:34,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 13:01:36,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:01:38,413 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:41,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 13:01:42,778 INFO [train.py:1039] (0/4) Epoch 21, batch 1750, loss[loss=0.1574, simple_loss=0.2311, pruned_loss=0.04184, over 20686.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2513, pruned_loss=0.04971, over 4721162.24 frames. ], batch size: 45, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:01:43,283 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=719946.6666666666, ans=0.09899494936611666 2023-09-30 13:01:46,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:47,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:49,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:01:49,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 13:01:50,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:54,520 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-108000.pt 2023-09-30 13:01:57,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 13:01:57,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:02,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 13:02:03,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:06,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 13:02:06,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:08,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:02:09,042 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=720013.3333333334, ans=15.0 2023-09-30 13:02:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:02:13,681 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 13:02:15,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:02:16,616 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. 
Duration: 0.81 2023-09-30 13:02:17,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=720013.3333333334, ans=0.125 2023-09-30 13:02:24,524 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:02:27,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:02:27,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:27,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=720080.0, ans=0.0 2023-09-30 13:02:31,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:31,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:32,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:02:34,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:35,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:36,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:02:38,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 13:02:39,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:41,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 13:02:41,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:43,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:43,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=720146.6666666666, ans=0.125 2023-09-30 13:02:45,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:02:48,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:02:48,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 13:02:48,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:51,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 13:02:52,130 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=720213.3333333334, ans=0.125 2023-09-30 13:02:56,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:56,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=720213.3333333334, ans=0.015 2023-09-30 13:02:59,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:01,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:03:01,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 13:03:01,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:03,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:03:03,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:03,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:03:03,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:03:04,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 13:03:09,630 INFO [train.py:1039] (0/4) Epoch 21, batch 1800, loss[loss=0.195, simple_loss=0.2461, pruned_loss=0.07192, over 19141.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2499, pruned_loss=0.04967, over 4695827.32 frames. ], batch size: 388, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:03:09,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:03:09,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:03:11,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:03:13,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:18,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:03:18,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:03:21,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:24,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:24,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:26,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 13:03:27,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:27,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 13:03:29,373 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:32,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:37,722 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 13:03:39,072 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.961e+02 2.256e+02 2.662e+02 3.514e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-30 13:03:39,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 13:03:40,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 13:03:40,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:41,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:41,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:03:41,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=720413.3333333334, ans=0.0 2023-09-30 13:03:43,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:03:43,510 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=720413.3333333334, ans=0.125 2023-09-30 13:03:51,544 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 13:03:51,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=720413.3333333334, ans=0.125 2023-09-30 13:03:53,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=720413.3333333334, ans=0.1 2023-09-30 13:03:54,428 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:03:56,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:58,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 13:03:58,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 13:03:59,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:04:01,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:04:02,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:04:07,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 13:04:14,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:04:14,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 13:04:15,602 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.54 vs. limit=15.0 2023-09-30 13:04:16,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:04:16,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:17,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:04:17,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 13:04:20,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:04:20,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:23,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. 
Duration: 0.87 2023-09-30 13:04:23,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:26,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:26,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:04:26,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:27,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:29,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:04:30,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:04:30,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:31,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=720613.3333333334, ans=0.125 2023-09-30 13:04:32,733 INFO [train.py:1039] (0/4) Epoch 21, batch 1850, loss[loss=0.2198, simple_loss=0.2742, pruned_loss=0.08271, over 19728.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2509, pruned_loss=0.05016, over 4687396.54 frames. ], batch size: 388, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:04:34,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 13:04:36,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:04:45,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:04:45,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 13:04:45,898 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=720613.3333333334, ans=0.125 2023-09-30 13:04:50,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 13:04:55,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 13:04:57,388 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=720680.0, ans=0.125 2023-09-30 13:04:58,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:59,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 13:05:00,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:05:04,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=720746.6666666666, ans=0.125 2023-09-30 13:05:07,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 13:05:07,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=720746.6666666666, ans=0.09899494936611666 2023-09-30 13:05:07,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=720746.6666666666, ans=0.07 2023-09-30 13:05:08,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 13:05:09,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=720746.6666666666, ans=0.125 2023-09-30 13:05:09,219 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=720746.6666666666, ans=0.05 2023-09-30 13:05:12,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:12,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:15,927 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=720746.6666666666, ans=0.09899494936611666 2023-09-30 13:05:17,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 13:05:17,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:17,166 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. 
Duration: 0.6181875 2023-09-30 13:05:19,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:05:20,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:05:21,168 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=720813.3333333334, ans=0.05 2023-09-30 13:05:24,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:05:27,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:05:27,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:27,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:05:27,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:05:30,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:32,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:05:36,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. 
Duration: 0.77 2023-09-30 13:05:36,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:41,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:05:41,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:05:41,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 13:05:41,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 13:05:44,627 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 13:05:44,747 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 13:05:47,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:05:47,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:47,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:05:47,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:49,260 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. 
Duration: 0.896 2023-09-30 13:05:49,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:05:50,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:50,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:05:53,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:05:54,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:54,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 13:05:56,026 INFO [train.py:1039] (0/4) Epoch 21, batch 1900, loss[loss=0.1766, simple_loss=0.2645, pruned_loss=0.0443, over 24327.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2512, pruned_loss=0.04944, over 4710554.07 frames. ], batch size: 74, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:05:57,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:57,815 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 13:05:57,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:05:59,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:05:59,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=720946.6666666666, ans=0.125 2023-09-30 13:06:04,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:07,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:06:07,991 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 13:06:08,426 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.28 vs. limit=12.0 2023-09-30 13:06:09,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 13:06:11,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:06:11,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:06:12,856 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 13:06:12,914 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 13:06:14,842 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=721013.3333333334, ans=0.1 2023-09-30 13:06:16,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-09-30 13:06:18,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:06:22,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 13:06:24,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 13:06:26,415 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.817e+02 1.990e+02 2.367e+02 3.522e+02, threshold=3.980e+02, percent-clipped=0.0 2023-09-30 13:06:35,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 13:06:40,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 13:06:40,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:06:41,611 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 13:06:41,619 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 13:06:41,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 13:06:41,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 13:06:41,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:06:47,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 13:06:49,255 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=721146.6666666666, ans=0.2 2023-09-30 13:06:50,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:06:55,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:06:55,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 13:06:55,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:07:00,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 13:07:00,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:08,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:07:08,743 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:07:08,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:07:10,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:07:10,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:07:10,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:07:11,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:07:14,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:14,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:16,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:07:16,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:07:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:18,232 INFO [train.py:1039] (0/4) Epoch 21, batch 1950, loss[loss=0.1689, simple_loss=0.26, pruned_loss=0.03884, over 24438.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2515, pruned_loss=0.04935, over 4719252.44 frames. ], batch size: 69, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:07:18,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:22,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 13:07:24,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:07:25,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:25,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:07:25,452 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=721280.0, ans=0.07 2023-09-30 13:07:28,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 13:07:28,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:07:28,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:30,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:33,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:07:33,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:34,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:36,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:07:39,171 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=3.98 vs. limit=15.0 2023-09-30 13:07:41,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:41,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:07:41,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:07:41,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:44,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:47,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:47,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:47,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:07:47,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 13:07:49,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:07:50,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:07:50,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:53,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:56,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:08:00,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:08:03,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:08:03,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:03,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 13:08:03,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:12,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:08:12,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:08:13,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:21,996 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:08:23,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:25,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:29,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:33,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:08:33,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:35,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 13:08:35,101 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:08:36,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:08:38,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 13:08:39,445 INFO [train.py:1039] (0/4) Epoch 21, batch 2000, loss[loss=0.1804, simple_loss=0.242, pruned_loss=0.05944, over 22673.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2521, pruned_loss=0.04964, over 4719724.85 frames. ], batch size: 322, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:08:39,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 13:08:44,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:44,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:08:44,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:46,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:08:49,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:52,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 13:08:52,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:55,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:08:57,522 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 13:08:59,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:08:59,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:09:00,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:09:02,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 13:09:02,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:03,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:04,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:05,039 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=721680.0, ans=0.07 2023-09-30 13:09:05,613 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=15.0 2023-09-30 13:09:06,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 13:09:07,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:09:08,949 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.918e+02 2.130e+02 2.425e+02 4.087e+02, threshold=4.260e+02, percent-clipped=1.0 2023-09-30 13:09:10,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 13:09:10,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:14,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:09:14,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:09:14,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:16,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:18,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:18,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 13:09:18,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=721746.6666666666, ans=0.125 2023-09-30 13:09:21,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 13:09:21,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:21,341 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:25,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:27,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:09:28,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 13:09:29,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:09:31,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:31,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:33,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:33,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:34,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:38,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:38,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 13:09:44,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=721880.0, ans=0.0 2023-09-30 13:09:45,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:09:45,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:09:48,178 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=721880.0, ans=0.0 2023-09-30 13:09:51,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:09:54,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:56,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:56,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:57,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:09:57,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:10:00,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:02,017 INFO [train.py:1039] (0/4) Epoch 21, batch 2050, loss[loss=0.1718, simple_loss=0.2258, pruned_loss=0.05884, over 19356.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2513, pruned_loss=0.04962, over 4704177.79 frames. ], batch size: 388, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:10:02,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:10:03,154 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.41 vs. limit=6.0 2023-09-30 13:10:05,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:10:06,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:07,157 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=721946.6666666666, ans=0.125 2023-09-30 13:10:10,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:10:11,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:10:13,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:14,826 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.35 vs. limit=22.5 2023-09-30 13:10:15,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:10:17,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 13:10:17,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:10:20,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:10:20,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:10:22,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=722013.3333333334, ans=0.125 2023-09-30 13:10:30,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:30,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:33,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 13:10:35,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:37,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 13:10:38,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:41,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:43,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:43,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 13:10:44,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:46,527 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:10:48,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:10:48,395 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:10:49,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:10:51,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:53,504 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:10:55,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:10:57,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:10:58,346 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.72 vs. limit=15.0 2023-09-30 13:11:02,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 13:11:02,627 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=722146.6666666666, ans=0.1 2023-09-30 13:11:07,679 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.49 vs. limit=22.5 2023-09-30 13:11:08,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:11:09,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 13:11:16,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:16,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:11:18,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:11:20,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 13:11:24,550 INFO [train.py:1039] (0/4) Epoch 21, batch 2100, loss[loss=0.1846, simple_loss=0.2526, pruned_loss=0.05832, over 23714.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2503, pruned_loss=0.04936, over 4706088.55 frames. ], batch size: 232, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:11:24,703 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 13:11:24,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:11:25,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:26,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:27,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:27,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 13:11:28,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 13:11:28,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:32,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:11:33,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:11:33,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:35,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:11:35,419 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 13:11:37,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 13:11:38,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 13:11:38,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 13:11:40,314 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=722346.6666666666, ans=0.125 2023-09-30 13:11:41,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:11:41,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:11:41,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 13:11:41,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=722346.6666666666, ans=0.0 2023-09-30 13:11:42,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:11:46,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 13:11:46,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:48,207 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=722346.6666666666, ans=0.2 2023-09-30 13:11:49,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:11:50,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:53,683 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.414e+02 1.855e+02 2.014e+02 2.188e+02 4.712e+02, threshold=4.028e+02, percent-clipped=1.0 2023-09-30 13:11:53,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:11:56,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 13:11:58,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:11:58,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 13:11:59,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 13:12:01,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:01,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 13:12:01,767 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 13:12:03,164 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 13:12:05,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. 
Duration: 0.87775 2023-09-30 13:12:06,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:12:09,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:10,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:11,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:13,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:13,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 13:12:13,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:13,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:13,400 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=722480.0, ans=0.0 2023-09-30 13:12:14,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:14,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. 
Duration: 0.76 2023-09-30 13:12:15,209 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=722480.0, ans=0.2 2023-09-30 13:12:16,415 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 13:12:16,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 13:12:22,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:12:25,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:12:26,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 13:12:32,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:36,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:12:36,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:12:36,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:12:36,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 13:12:36,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. 
Duration: 0.7444375 2023-09-30 13:12:38,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:38,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:12:40,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:12:40,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:41,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 13:12:43,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 13:12:43,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:45,579 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=15.0 2023-09-30 13:12:46,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:46,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:12:47,691 INFO [train.py:1039] (0/4) Epoch 21, batch 2150, loss[loss=0.1766, simple_loss=0.2436, pruned_loss=0.05484, over 23486.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2494, pruned_loss=0.04928, over 4703530.86 frames. 
], batch size: 285, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:12:47,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:12:47,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:12:52,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:12:54,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:55,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:56,307 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.20 vs. limit=15.0 2023-09-30 13:12:57,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:12:57,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:12:57,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:13:02,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:04,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:13:04,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:13:07,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:07,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 13:13:13,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:13,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:13:14,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:14,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:16,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:16,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:13:16,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:17,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:13:17,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:13:19,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 13:13:20,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:13:20,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:21,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:24,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:13:24,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:13:24,673 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=722746.6666666666, ans=0.025 2023-09-30 13:13:26,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:27,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:13:29,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:29,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 13:13:29,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. 
Duration: 0.87775 2023-09-30 13:13:32,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:32,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:34,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:35,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=722813.3333333334, ans=0.125 2023-09-30 13:13:37,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:13:39,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:41,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:41,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 13:13:43,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 13:13:43,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:13:43,422 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 13:13:43,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:13:45,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:13:45,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 13:13:45,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:13:45,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 13:13:45,211 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 13:13:45,211 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 13:13:46,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 13:13:49,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:51,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:51,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:13:51,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:52,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-30 13:13:54,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:54,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:59,734 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten.whitening_limit, batch_count=722880.0, ans=15.0 2023-09-30 13:14:03,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:14:03,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 13:14:08,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:14:09,592 INFO [train.py:1039] (0/4) Epoch 21, batch 2200, loss[loss=0.1913, simple_loss=0.2679, pruned_loss=0.0574, over 23305.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2502, pruned_loss=0.04958, over 4706570.46 frames. ], batch size: 105, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:14:10,489 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.45 vs. limit=15.0 2023-09-30 13:14:14,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:15,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:14:15,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:14:16,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:14:17,156 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=722946.6666666666, ans=0.0 2023-09-30 13:14:20,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:14:20,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:14:20,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 13:14:25,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 13:14:26,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:14:28,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=723013.3333333334, ans=0.125 2023-09-30 13:14:31,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 13:14:34,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:36,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:14:36,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 13:14:37,873 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:14:39,236 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.806e+02 1.948e+02 2.214e+02 3.228e+02, threshold=3.896e+02, percent-clipped=0.0 2023-09-30 13:14:39,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 13:14:42,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:14:44,602 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:46,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:14:50,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:14:52,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:14:55,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:14:56,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:58,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 13:14:59,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:15:01,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 13:15:03,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:03,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:15:04,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:06,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:15:06,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:06,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:07,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:09,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:15:09,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:15:10,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:15:15,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. 
Duration: 0.4818125 2023-09-30 13:15:15,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:15:17,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:15:17,192 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 13:15:17,365 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=723213.3333333334, ans=0.2 2023-09-30 13:15:21,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:15:21,626 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 13:15:25,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:15:25,693 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 13:15:27,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:27,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:15:28,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:31,766 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. 
Duration: 0.9045625 2023-09-30 13:15:32,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=723280.0, ans=0.125 2023-09-30 13:15:33,200 INFO [train.py:1039] (0/4) Epoch 21, batch 2250, loss[loss=0.2426, simple_loss=0.2979, pruned_loss=0.09368, over 19295.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2514, pruned_loss=0.05001, over 4698886.20 frames. ], batch size: 388, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:15:33,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:15:34,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:41,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:15:41,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:15:45,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:46,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:15:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:50,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 13:15:50,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:15:50,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:15:52,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 13:15:54,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:54,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:57,499 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:16:03,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:04,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:16:04,977 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:16:06,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 13:16:07,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:16:09,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:16:15,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:16:18,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:19,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:16:19,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:16:22,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:24,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:16:28,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:16:31,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:16:37,013 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=723546.6666666666, ans=0.125 2023-09-30 13:16:38,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:16:38,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:16:38,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:16:43,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. 
Duration: 0.5545625 2023-09-30 13:16:46,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:16:46,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 13:16:47,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:47,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:16:50,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 13:16:53,629 INFO [train.py:1039] (0/4) Epoch 21, batch 2300, loss[loss=0.1466, simple_loss=0.2263, pruned_loss=0.03347, over 20352.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2517, pruned_loss=0.04969, over 4702983.99 frames. ], batch size: 44, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:16:53,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:16:53,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:17:03,545 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. 
Duration: 0.9656875 2023-09-30 13:17:05,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:08,406 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=723680.0, ans=0.0 2023-09-30 13:17:13,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:17:13,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:17:15,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:16,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:16,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 13:17:16,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:17:18,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:19,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 13:17:22,893 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.842e+02 2.058e+02 2.392e+02 4.261e+02, threshold=4.115e+02, percent-clipped=2.0 2023-09-30 13:17:23,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=723680.0, ans=0.125 2023-09-30 13:17:23,771 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.43 vs. limit=22.5 2023-09-30 13:17:24,557 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:17:27,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:17:31,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:37,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:17:38,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:41,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:17:44,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:17:47,006 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:17:50,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:51,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:17:52,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:17:52,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 13:17:56,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:17:56,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:57,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:57,087 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:17:58,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:17:58,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:17:58,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. 
Duration: 0.9 2023-09-30 13:18:00,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 13:18:00,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:18:00,050 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:00,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 13:18:06,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:18:09,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:18:09,807 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=723880.0, ans=10.0 2023-09-30 13:18:13,104 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-09-30 13:18:13,897 INFO [train.py:1039] (0/4) Epoch 21, batch 2350, loss[loss=0.1937, simple_loss=0.2603, pruned_loss=0.06361, over 23710.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2528, pruned_loss=0.05017, over 4713717.35 frames. ], batch size: 212, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:18:16,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:18:16,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 13:18:16,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:18:17,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:18:17,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:18,806 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.52 vs. limit=15.0 2023-09-30 13:18:20,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:18:20,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 13:18:20,437 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=723946.6666666666, ans=0.125 2023-09-30 13:18:27,325 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=723946.6666666666, ans=0.2 2023-09-30 13:18:28,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:18:28,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 13:18:33,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 13:18:37,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:18:39,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:39,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:18:40,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 13:18:42,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:18:49,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 13:18:50,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:53,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=724080.0, ans=0.125 2023-09-30 13:18:54,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:18:54,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:56,519 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 13:18:58,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 13:18:59,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:19:01,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:19:01,642 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:03,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:19:04,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:19:07,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 13:19:07,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:19:12,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:19:12,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:19:13,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 13:19:14,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 13:19:17,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 13:19:17,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:19:20,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 13:19:21,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=724213.3333333334, ans=0.125 2023-09-30 13:19:26,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 13:19:26,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:26,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:19:26,184 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 13:19:26,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=724213.3333333334, ans=0.125 2023-09-30 13:19:27,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 13:19:29,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 13:19:34,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 13:19:37,905 INFO [train.py:1039] (0/4) Epoch 21, batch 2400, loss[loss=0.1685, simple_loss=0.2224, pruned_loss=0.05732, over 19628.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.252, pruned_loss=0.05015, over 4719997.09 frames. ], batch size: 389, lr: 4.90e-03, grad_scale: 32.0 2023-09-30 13:19:38,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:19:41,308 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:19:41,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=724280.0, ans=0.2 2023-09-30 13:19:44,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:19:45,825 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 13:19:45,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 13:19:54,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:19:54,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:19:55,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 13:19:55,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 13:19:55,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:19:57,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 13:20:02,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:06,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 13:20:07,665 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.817e+02 2.030e+02 2.238e+02 3.635e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 13:20:12,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:20:12,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=724413.3333333334, ans=0.125 2023-09-30 13:20:17,129 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 13:20:19,183 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=724413.3333333334, ans=0.5 2023-09-30 13:20:20,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:20,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:20:24,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:25,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 13:20:26,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:20:27,052 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=724480.0, ans=0.0 2023-09-30 13:20:34,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:35,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:20:39,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:20:41,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:20:41,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:20:41,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:20:41,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:41,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:20:41,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:20:44,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=724546.6666666666, ans=0.125 2023-09-30 13:20:46,379 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=724546.6666666666, ans=0.2 2023-09-30 13:20:47,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:20:47,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:20:47,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 13:20:49,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 13:20:52,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:52,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:52,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 13:20:53,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 13:20:53,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. 
Duration: 0.85 2023-09-30 13:20:53,906 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 13:20:55,371 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 13:20:55,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:58,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:58,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:20:58,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=724613.3333333334, ans=0.0 2023-09-30 13:20:59,987 INFO [train.py:1039] (0/4) Epoch 21, batch 2450, loss[loss=0.1789, simple_loss=0.2647, pruned_loss=0.04657, over 23623.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2511, pruned_loss=0.05021, over 4711305.84 frames. ], batch size: 85, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:21:00,125 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 13:21:00,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:02,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:21:04,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 13:21:06,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:21:09,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:09,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:11,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 13:21:18,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:21:18,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:21,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:21:21,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:21:21,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:21:22,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 13:21:27,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:29,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 13:21:30,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:21:33,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:21:33,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:36,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:36,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:36,758 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=724746.6666666666, ans=0.125 2023-09-30 13:21:38,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 13:21:40,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:21:40,508 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=724746.6666666666, ans=0.1 2023-09-30 13:21:48,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:50,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:21:51,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:21:52,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:21:52,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:53,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:21:54,125 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=724813.3333333334, ans=0.125 2023-09-30 13:21:55,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 13:21:58,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:58,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:22:02,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:02,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:22:03,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=724813.3333333334, ans=0.125 2023-09-30 13:22:06,970 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.69 vs. limit=15.0 2023-09-30 13:22:09,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:22:09,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 13:22:10,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:22:10,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:10,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 13:22:10,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:22:14,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:22:16,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:22:19,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:22:19,796 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=724880.0, ans=0.125 2023-09-30 13:22:20,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:22:22,992 INFO [train.py:1039] (0/4) Epoch 21, batch 2500, loss[loss=0.1637, simple_loss=0.2441, pruned_loss=0.0416, over 24289.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2498, pruned_loss=0.04944, over 4704856.49 frames. ], batch size: 61, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:22:24,603 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 13:22:25,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:22:31,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:32,373 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.48 vs. limit=15.0 2023-09-30 13:22:39,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:22:39,688 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=725013.3333333334, ans=0.125 2023-09-30 13:22:40,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:42,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:22:42,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 13:22:49,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:22:49,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:51,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:22:51,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:22:52,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 13:22:54,205 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 2.019e+02 2.359e+02 2.829e+02 4.327e+02, threshold=4.718e+02, percent-clipped=1.0 2023-09-30 13:22:54,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:54,602 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=725080.0, ans=0.125 2023-09-30 13:22:56,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:56,357 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 13:22:56,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:22:57,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 13:22:57,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:03,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:23:04,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:23:07,620 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:23:09,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 13:23:09,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:09,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:13,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:18,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:19,148 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=725146.6666666666, ans=0.1 2023-09-30 13:23:21,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 13:23:28,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:23:30,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 13:23:30,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:23:32,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:23:33,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:23:33,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:23:33,868 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:23:33,893 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=725213.3333333334, ans=0.2 2023-09-30 13:23:36,016 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 13:23:36,017 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 13:23:36,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 13:23:38,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:23:41,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 13:23:41,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 13:23:41,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:44,788 INFO [train.py:1039] (0/4) Epoch 21, batch 2550, loss[loss=0.1787, simple_loss=0.2662, pruned_loss=0.04558, over 24544.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2497, pruned_loss=0.04918, over 4708589.36 frames. ], batch size: 71, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:23:44,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 13:23:46,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 13:23:49,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:51,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:51,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:23:51,648 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=725280.0, ans=0.07 2023-09-30 13:23:54,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:23:56,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 13:23:56,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:24:00,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 13:24:00,277 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=725346.6666666666, ans=0.1 2023-09-30 13:24:03,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:24:05,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:05,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:24:06,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 13:24:06,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:08,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:08,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:11,747 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 13:24:11,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 13:24:11,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:24:11,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:11,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 13:24:13,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=725346.6666666666, ans=0.95 2023-09-30 13:24:24,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:24:32,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:32,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:32,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:32,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:24:38,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=725480.0, ans=0.125 2023-09-30 13:24:39,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:24:41,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:41,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:24:43,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:24:43,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:24:43,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:24:46,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:48,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:51,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:24:51,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 13:24:51,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:24:51,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:52,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. 
Duration: 0.62725 2023-09-30 13:24:54,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:24:55,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:04,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:25:05,032 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=725546.6666666666, ans=0.125 2023-09-30 13:25:07,611 INFO [train.py:1039] (0/4) Epoch 21, batch 2600, loss[loss=0.1857, simple_loss=0.2601, pruned_loss=0.05562, over 23930.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2503, pruned_loss=0.04902, over 4721890.02 frames. ], batch size: 86, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:25:07,696 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:09,349 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 13:25:12,751 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 13:25:12,776 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:25:12,825 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 13:25:12,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. 
Duration: 0.92 2023-09-30 13:25:13,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=725613.3333333334, ans=0.125 2023-09-30 13:25:14,420 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 13:25:16,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:25:16,263 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 13:25:17,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 13:25:18,698 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.79 vs. limit=15.0 2023-09-30 13:25:19,303 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 13:25:21,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:25:24,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 13:25:25,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 13:25:25,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 13:25:26,326 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725680.0, ans=0.125 2023-09-30 13:25:27,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 13:25:29,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 13:25:29,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 13:25:32,113 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=15.0 2023-09-30 13:25:36,500 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=725680.0, ans=0.2 2023-09-30 13:25:37,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:37,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:37,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:37,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 13:25:39,010 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.842e+02 2.050e+02 2.225e+02 3.222e+02, threshold=4.100e+02, percent-clipped=0.0 2023-09-30 13:25:42,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 13:25:49,010 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 13:25:53,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:55,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:55,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 13:25:56,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:25:56,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:57,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 13:26:00,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:26:00,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:26:02,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:06,771 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.75 vs. limit=22.5 2023-09-30 13:26:07,513 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. 
Duration: 0.896 2023-09-30 13:26:07,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:07,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:26:15,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:26:15,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:26:15,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 13:26:17,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:26:19,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:21,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:21,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=725880.0, ans=0.125 2023-09-30 13:26:25,957 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=725880.0, ans=0.04949747468305833 2023-09-30 13:26:27,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 13:26:27,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:26:27,635 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=725880.0, ans=0.0 2023-09-30 13:26:28,814 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:26:30,214 INFO [train.py:1039] (0/4) Epoch 21, batch 2650, loss[loss=0.1612, simple_loss=0.2428, pruned_loss=0.0398, over 24481.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.04962, over 4710363.32 frames. ], batch size: 66, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:26:33,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 13:26:33,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:34,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:26:35,506 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 13:26:36,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:26:40,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:42,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:26:45,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 13:26:46,938 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:48,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 13:26:48,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:26:48,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:26:50,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 13:26:51,289 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=726013.3333333334, ans=0.0 2023-09-30 13:26:52,679 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=726013.3333333334, ans=0.0 2023-09-30 13:26:54,464 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 13:26:56,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:59,210 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 13:26:59,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:00,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. 
Duration: 0.76 2023-09-30 13:27:05,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:05,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:27:06,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:06,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:13,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 13:27:13,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 13:27:16,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:27:19,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 13:27:19,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:21,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:21,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:22,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:27:23,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:25,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:26,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:28,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:27:29,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:27:29,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:27:31,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:32,718 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.40 vs. limit=22.5 2023-09-30 13:27:33,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:27:33,737 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=726146.6666666666, ans=0.125 2023-09-30 13:27:34,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:27:36,882 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=726213.3333333334, ans=0.125 2023-09-30 13:27:37,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:37,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:27:38,386 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=726213.3333333334, ans=0.0 2023-09-30 13:27:39,781 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=726213.3333333334, ans=0.07 2023-09-30 13:27:40,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:41,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:27:41,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:41,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 13:27:46,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:46,346 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:47,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 13:27:49,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:51,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:51,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:52,794 INFO [train.py:1039] (0/4) Epoch 21, batch 2700, loss[loss=0.1746, simple_loss=0.2351, pruned_loss=0.05707, over 23292.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2516, pruned_loss=0.04985, over 4717671.20 frames. ], batch size: 285, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:27:55,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:27:55,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 13:27:59,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:01,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:28:02,827 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:28:04,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:28:04,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:28:04,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:04,424 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:28:05,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:28:05,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:05,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:28:05,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:28:05,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 13:28:07,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:28:10,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:28:10,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:28:10,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:28:15,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 13:28:16,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 13:28:16,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:28:21,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:28:21,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:28:23,277 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.895e+02 2.064e+02 2.321e+02 3.195e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 13:28:27,196 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:28:27,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:28:27,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:28:27,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:28:31,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:34,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:28:34,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:28:34,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:28:39,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:39,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:28:44,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=726480.0, ans=0.0 2023-09-30 13:28:47,445 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=726480.0, ans=0.125 2023-09-30 13:28:48,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:48,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:51,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:28:51,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:28:54,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=726480.0, ans=0.02 2023-09-30 13:28:54,654 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.80 vs. limit=10.0 2023-09-30 13:28:56,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=726480.0, ans=22.5 2023-09-30 13:28:57,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:58,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:58,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:58,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:01,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:29:01,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:01,295 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=726546.6666666666, ans=0.0 2023-09-30 13:29:04,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 13:29:05,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:05,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:10,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 13:29:12,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:12,756 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=726546.6666666666, ans=0.0 2023-09-30 13:29:15,390 INFO [train.py:1039] (0/4) Epoch 21, batch 2750, loss[loss=0.1637, simple_loss=0.2444, pruned_loss=0.04155, over 24499.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.251, pruned_loss=0.04997, over 4714668.89 frames. ], batch size: 63, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:29:15,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:29:15,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 13:29:15,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 13:29:15,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:20,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:29:21,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:23,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:23,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:29:23,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:29,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:29:29,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:29:29,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:29:30,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:30,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 13:29:30,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:29:30,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:36,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. 
Duration: 0.46 2023-09-30 13:29:37,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:29:38,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:39,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:39,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:29:41,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:42,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:29:42,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:42,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:47,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:29:47,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:29:49,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 13:29:51,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:52,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:30:00,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:03,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:30:03,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:05,738 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=726813.3333333334, ans=0.04949747468305833 2023-09-30 13:30:05,740 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=726813.3333333334, ans=0.0 2023-09-30 13:30:08,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:30:08,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:30:08,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 13:30:11,343 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=726813.3333333334, ans=0.125 2023-09-30 13:30:15,520 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:30:15,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:30:15,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 13:30:22,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:22,399 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=726880.0, ans=0.125 2023-09-30 13:30:23,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 13:30:23,864 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=726880.0, ans=0.2 2023-09-30 13:30:26,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:30:29,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:30:29,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 13:30:31,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:30:32,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:30:34,957 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 13:30:35,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:30:37,941 INFO [train.py:1039] (0/4) Epoch 21, batch 2800, loss[loss=0.1874, simple_loss=0.2518, pruned_loss=0.06152, over 23773.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.249, pruned_loss=0.04993, over 4688871.42 frames. ], batch size: 179, lr: 4.89e-03, grad_scale: 32.0 2023-09-30 13:30:38,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:30:39,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:30:39,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:30:41,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 13:30:41,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:42,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:44,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:30:44,243 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 13:30:44,245 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 13:30:47,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:49,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:30:49,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:30:49,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=726946.6666666666, ans=0.04949747468305833 2023-09-30 13:30:52,109 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=726946.6666666666, ans=0.125 2023-09-30 13:30:53,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:56,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 13:30:57,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:30:59,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 13:31:00,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:31:00,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:31:02,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:05,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:05,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:05,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:31:07,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:10,735 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.859e+02 2.240e+02 2.757e+02 3.972e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-30 13:31:14,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=727080.0, ans=0.0 2023-09-30 13:31:15,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:31:17,125 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:31:21,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:31:22,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:31:24,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:28,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:28,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 13:31:28,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:29,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:29,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:31:34,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:35,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:37,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:40,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:31:40,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:31:40,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:31:42,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:31:42,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:31:44,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:44,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 13:31:44,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:46,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:46,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:47,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 13:31:47,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:47,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:31:48,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 13:31:48,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 13:31:56,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:57,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:31:58,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:32:01,343 INFO [train.py:1039] (0/4) Epoch 21, batch 2850, loss[loss=0.1575, simple_loss=0.2375, pruned_loss=0.0388, over 24717.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2487, pruned_loss=0.04991, over 4697752.35 frames. ], batch size: 65, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:32:01,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:03,605 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.79 vs. limit=22.5 2023-09-30 13:32:06,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:06,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:06,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:32:06,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=727280.0, ans=0.0 2023-09-30 13:32:09,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:09,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:32:10,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:32:12,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 13:32:20,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 13:32:20,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:21,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 13:32:23,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:25,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 13:32:27,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 13:32:29,258 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:32:33,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=727413.3333333334, ans=0.125 2023-09-30 13:32:39,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=727413.3333333334, ans=0.1 2023-09-30 13:32:40,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:42,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:42,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:43,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:32:43,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:32:43,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:32:45,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:32:45,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 13:32:47,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 13:32:47,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:32:48,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:48,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:52,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:52,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:52,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=727480.0, ans=0.1 2023-09-30 13:32:53,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:55,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:57,478 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:32:58,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:59,541 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.77 vs. 
limit=10.0 2023-09-30 13:33:00,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:03,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:33:07,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=727546.6666666666, ans=0.0 2023-09-30 13:33:08,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:33:10,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 13:33:10,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 13:33:11,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:33:13,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:13,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 13:33:13,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:33:14,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:14,880 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 13:33:14,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:33:14,929 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 13:33:16,416 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 13:33:16,421 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:16,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:21,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:21,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:22,474 INFO [train.py:1039] (0/4) Epoch 21, batch 2900, loss[loss=0.1443, simple_loss=0.2258, pruned_loss=0.03139, over 24551.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2485, pruned_loss=0.04921, over 4702487.36 frames. ], batch size: 60, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:33:22,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:33:24,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 13:33:26,599 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.95 vs. 
limit=15.0 2023-09-30 13:33:28,361 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.20 vs. limit=22.5 2023-09-30 13:33:29,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:29,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 13:33:30,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 13:33:32,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:33:32,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:33:34,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:36,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:33:40,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:40,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:43,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:33:43,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. 
Duration: 0.67 2023-09-30 13:33:44,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:33:45,275 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=727680.0, ans=0.95 2023-09-30 13:33:46,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:48,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 13:33:48,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 13:33:51,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:51,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 13:33:51,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:33:54,190 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:33:54,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:55,598 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.882e+02 2.103e+02 2.412e+02 3.503e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-30 13:33:57,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:33:58,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:04,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:34:07,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:09,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 13:34:09,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 13:34:09,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:34:13,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:34:15,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 13:34:16,680 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:34:21,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:28,443 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.32 vs. limit=12.0 2023-09-30 13:34:32,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 13:34:32,045 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:34:32,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 13:34:37,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:39,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 13:34:39,119 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:39,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:34:45,996 INFO [train.py:1039] (0/4) Epoch 21, batch 2950, loss[loss=0.1669, simple_loss=0.244, pruned_loss=0.04487, over 23459.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2498, pruned_loss=0.04923, over 4704385.57 frames. ], batch size: 134, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:34:46,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:47,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 13:34:47,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:34:47,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:34:49,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:34:52,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:34:53,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 13:34:54,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 13:34:55,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:34:55,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:35:02,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:04,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:05,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:05,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=728013.3333333334, ans=0.1 2023-09-30 13:35:07,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:12,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 13:35:12,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:35:14,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:35:17,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 13:35:18,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=728080.0, ans=0.05 2023-09-30 13:35:23,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 13:35:23,909 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 13:35:25,908 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:35:27,438 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 13:35:29,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 13:35:29,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:35:29,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:29,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 13:35:29,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:35:32,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 13:35:33,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:33,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:35:35,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:38,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:35:38,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:38,517 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 13:35:38,588 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:40,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. 
Duration: 0.91 2023-09-30 13:35:45,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:46,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:35:47,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 13:35:47,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:35:48,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=728146.6666666666, ans=0.07 2023-09-30 13:35:49,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 13:35:52,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:35:52,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:54,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:35:55,897 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:55,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:35:58,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 13:35:59,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:59,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:35:59,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:36:00,018 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=728213.3333333334, ans=0.125 2023-09-30 13:36:01,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:36:01,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:36:02,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:02,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 13:36:04,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:06,213 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:36:06,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:36:09,253 INFO [train.py:1039] (0/4) Epoch 21, batch 3000, loss[loss=0.1817, simple_loss=0.246, pruned_loss=0.05875, over 23598.00 frames. 
], tot_loss[loss=0.1745, simple_loss=0.2502, pruned_loss=0.04941, over 4696531.55 frames. ], batch size: 256, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:36:09,254 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 13:36:24,052 INFO [train.py:1071] (0/4) Epoch 21, validation: loss=0.3084, simple_loss=0.2796, pruned_loss=0.1686, over 1125622.00 frames. 2023-09-30 13:36:24,053 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20954MB 2023-09-30 13:36:25,696 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 13:36:25,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 13:36:30,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:36:30,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:36:30,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 13:36:32,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:36:33,459 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.00 vs. limit=6.0 2023-09-30 13:36:38,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:36:41,388 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.03 vs. 
limit=15.0 2023-09-30 13:36:47,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:36:54,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 13:36:54,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:36:58,418 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.897e+02 2.038e+02 2.302e+02 2.961e+02, threshold=4.076e+02, percent-clipped=0.0 2023-09-30 13:36:59,507 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:37:00,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:37:00,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:37:00,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:04,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:04,199 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 13:37:05,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 13:37:08,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:37:08,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:37:11,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:37:11,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:12,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:12,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:37:16,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=728480.0, ans=0.035 2023-09-30 13:37:17,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:37:18,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:18,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:37:20,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:22,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 13:37:24,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 13:37:24,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:24,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:37:26,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:28,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:29,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:37:29,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 13:37:31,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:37:31,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 13:37:31,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:37:35,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 13:37:39,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:37:40,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-30 13:37:40,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 13:37:40,795 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 13:37:40,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:37:40,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:37:42,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:42,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:37:42,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:43,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:37:45,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 13:37:46,995 INFO [train.py:1039] (0/4) Epoch 21, batch 3050, loss[loss=0.1797, simple_loss=0.2474, pruned_loss=0.05607, over 23523.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2508, pruned_loss=0.0496, over 4699653.82 frames. ], batch size: 134, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:37:47,285 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 13:37:50,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:50,408 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=728613.3333333334, ans=0.0 2023-09-30 13:37:51,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:37:56,228 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:59,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 13:37:59,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=728613.3333333334, ans=0.125 2023-09-30 13:38:04,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 13:38:06,216 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 13:38:06,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:11,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:38:16,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:16,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:38:16,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:19,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:20,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:38:20,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:22,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:22,328 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:23,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:25,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:27,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:27,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 13:38:28,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:28,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 13:38:33,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:38:34,059 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.52 vs. limit=22.5 2023-09-30 13:38:34,772 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:38:34,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:38:34,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:40,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:41,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:48,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:49,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:38:49,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:51,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 13:38:52,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:38:52,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:54,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 13:38:55,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:55,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:57,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 13:38:58,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:06,246 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:07,747 INFO [train.py:1039] (0/4) Epoch 21, batch 3100, loss[loss=0.1703, simple_loss=0.2559, pruned_loss=0.04234, over 24462.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2505, pruned_loss=0.04971, over 4711286.56 frames. ], batch size: 69, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:39:09,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:39:10,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-30 13:39:13,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 13:39:14,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 13:39:16,917 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.37 vs. limit=22.5 2023-09-30 13:39:17,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 13:39:17,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:39:21,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:39:21,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:22,672 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.63 vs. limit=22.5 2023-09-30 13:39:25,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:39:30,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:34,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 13:39:39,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-30 13:39:39,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:39,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:39:39,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:39:40,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:39:42,283 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.906e+02 2.121e+02 2.536e+02 4.254e+02, threshold=4.242e+02, percent-clipped=1.0 2023-09-30 13:39:43,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:39:43,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 13:39:43,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:39:45,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:47,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 13:39:48,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:39:52,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 13:39:54,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 13:39:56,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 13:39:56,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:57,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:01,261 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:02,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:02,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:40:04,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:40:04,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:40:05,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:40:05,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:05,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:40:05,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 13:40:12,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:40:12,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 13:40:15,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:40:15,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 13:40:15,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:16,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:16,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 13:40:28,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 13:40:28,993 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=729280.0, ans=0.0 2023-09-30 13:40:30,073 INFO [train.py:1039] (0/4) Epoch 21, batch 3150, loss[loss=0.1589, simple_loss=0.2048, pruned_loss=0.05646, over 19454.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2485, pruned_loss=0.04967, over 4686187.26 frames. 
], batch size: 388, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:40:30,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:32,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:34,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:40:36,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:40:36,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 13:40:37,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:37,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:40:38,573 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.76 vs. limit=6.0 2023-09-30 13:40:39,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 13:40:40,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:42,816 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 13:40:44,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. 
Duration: 0.53 2023-09-30 13:40:44,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:40:45,853 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 13:40:45,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:40:47,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 13:40:48,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 13:40:48,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 13:40:49,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:49,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:50,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:50,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 13:40:52,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:40:54,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:57,914 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:41:02,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 13:41:02,834 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:41:07,122 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:41:07,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:41:07,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 13:41:10,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 13:41:11,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:41:12,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:41:12,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:41:13,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:41:13,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:41:13,930 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=729413.3333333334, ans=0.0 2023-09-30 13:41:15,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:41:15,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:41:16,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 13:41:18,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:41:18,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:19,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:41:19,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:41:21,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 13:41:21,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=729480.0, ans=0.025 2023-09-30 13:41:22,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 13:41:24,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 13:41:24,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:26,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 13:41:26,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 13:41:28,152 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:41:29,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:31,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 13:41:31,147 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:41:32,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:35,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:41:37,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:37,598 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 13:41:39,963 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=729546.6666666666, ans=0.125 2023-09-30 13:41:42,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:41:43,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:44,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=729546.6666666666, ans=0.125 2023-09-30 13:41:45,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 13:41:51,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:41:51,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:41:53,359 INFO [train.py:1039] (0/4) Epoch 21, batch 3200, loss[loss=0.1622, simple_loss=0.2502, pruned_loss=0.03707, over 24656.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2474, pruned_loss=0.04872, over 4687908.77 frames. ], batch size: 65, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:41:56,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:58,021 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:41:58,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. 
Duration: 0.85 2023-09-30 13:42:00,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:42:00,887 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=729613.3333333334, ans=0.1 2023-09-30 13:42:05,467 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:42:06,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:42:10,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:42:18,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:42:28,690 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.936e+02 2.183e+02 2.546e+02 4.680e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 13:42:28,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 13:42:30,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:42:34,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 13:42:35,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:42:40,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 13:42:40,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:42:41,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:42:47,385 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 13:42:48,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:42:49,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=729813.3333333334, ans=0.125 2023-09-30 13:42:50,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 13:42:54,044 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 13:42:55,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:43:01,947 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:43:03,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,445 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. 
Duration: 0.896 2023-09-30 13:43:03,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:43:06,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:08,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 13:43:10,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 13:43:10,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 13:43:13,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 13:43:14,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:43:16,201 INFO [train.py:1039] (0/4) Epoch 21, batch 3250, loss[loss=0.1818, simple_loss=0.249, pruned_loss=0.05729, over 23464.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2481, pruned_loss=0.04877, over 4700897.73 frames. ], batch size: 285, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:43:17,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:43:17,781 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 13:43:17,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 13:43:19,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:19,428 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 13:43:23,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:43:26,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:43:34,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:43:34,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 13:43:36,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:36,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:36,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:39,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:39,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:43:42,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:43:42,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:43:42,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:44,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:43:47,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:49,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:50,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:50,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:52,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:54,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:54,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 13:43:59,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 13:44:00,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:44:00,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:44:02,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:02,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:44:02,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=730080.0, ans=0.07 2023-09-30 13:44:08,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:44:14,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:16,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:16,011 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 13:44:16,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:44:16,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. 
Duration: 0.3818125 2023-09-30 13:44:16,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:19,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 13:44:19,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 13:44:21,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:44:21,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:22,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:22,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:44:24,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:27,456 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.89 vs. limit=15.0 2023-09-30 13:44:28,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:44:28,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:29,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. 
Duration: 0.89 2023-09-30 13:44:29,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:29,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=730213.3333333334, ans=0.0 2023-09-30 13:44:33,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:44:33,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 13:44:37,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:37,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 13:44:38,542 INFO [train.py:1039] (0/4) Epoch 21, batch 3300, loss[loss=0.2048, simple_loss=0.2882, pruned_loss=0.06068, over 24340.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2495, pruned_loss=0.04883, over 4714148.03 frames. ], batch size: 77, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:44:38,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 13:44:40,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 13:44:41,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:46,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:44:47,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:44:47,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:48,309 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=730280.0, ans=0.0 2023-09-30 13:44:50,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:44:50,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:44:54,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:56,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:45:01,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 13:45:01,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:01,195 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:03,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:03,351 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. 
Duration: 0.896 2023-09-30 13:45:03,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=730346.6666666666, ans=0.0 2023-09-30 13:45:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:06,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:45:07,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:45:07,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:09,817 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 13:45:13,412 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.837e+02 2.019e+02 2.374e+02 3.048e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 13:45:13,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:13,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:45:16,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:16,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 13:45:18,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. 
Duration: 0.336375 2023-09-30 13:45:18,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:19,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:45:22,536 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 13:45:24,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 13:45:25,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:45:27,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 13:45:28,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:45:32,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:45:34,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:45:35,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:37,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:37,207 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 13:45:37,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:45:40,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:45:40,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:42,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:45:43,851 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 13:45:45,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 13:45:47,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:45:47,853 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:45:47,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:49,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:49,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 13:45:49,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=730546.6666666666, ans=0.05 2023-09-30 13:45:50,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:45:51,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:52,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:45:53,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:54,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:45:54,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=730546.6666666666, ans=0.125 2023-09-30 13:45:56,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 13:45:57,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:58,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:59,868 INFO [train.py:1039] (0/4) Epoch 21, batch 3350, loss[loss=0.1652, simple_loss=0.2468, pruned_loss=0.04186, over 23221.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2505, pruned_loss=0.04934, over 4716407.96 frames. 
], batch size: 93, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:46:01,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:46:01,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:46:04,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:04,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:46:04,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:09,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:46:11,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:12,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:46:13,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:16,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:46:18,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:46:20,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:46:21,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 13:46:23,866 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 13:46:23,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:27,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 13:46:27,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 13:46:28,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:46:28,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:46:31,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:31,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 13:46:31,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:33,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 13:46:34,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:36,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:37,066 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.45 vs. limit=10.0 2023-09-30 13:46:37,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:37,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:46:40,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:43,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:43,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:46,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:46:48,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:51,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 13:46:51,315 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:52,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:56,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 13:46:56,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:46:56,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 13:46:56,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:46:58,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 13:46:58,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:00,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:47:07,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:09,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. 
Duration: 0.66 2023-09-30 13:47:09,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=730880.0, ans=0.0 2023-09-30 13:47:10,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:12,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:47:12,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:47:13,359 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.12 vs. limit=15.0 2023-09-30 13:47:17,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:20,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 13:47:20,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:47:20,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:47:22,731 INFO [train.py:1039] (0/4) Epoch 21, batch 3400, loss[loss=0.1645, simple_loss=0.2429, pruned_loss=0.04305, over 20868.00 frames. ], tot_loss[loss=0.175, simple_loss=0.251, pruned_loss=0.04948, over 4709924.96 frames. ], batch size: 45, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:47:24,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:47:25,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 13:47:27,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:27,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 13:47:29,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,617 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:47:31,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:47:31,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 13:47:36,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 13:47:36,438 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 13:47:36,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:47:40,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:47:40,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:42,426 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:47:43,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:47:49,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:47:50,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 13:47:58,456 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 2.005e+02 2.277e+02 4.714e+02, threshold=4.010e+02, percent-clipped=1.0 2023-09-30 13:47:58,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:47:58,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:00,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:00,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:48:07,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:48:12,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. 
Duration: 0.96 2023-09-30 13:48:16,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:16,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:17,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 13:48:17,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:18,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:19,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:48:19,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:48:22,739 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.48 vs. limit=15.0 2023-09-30 13:48:23,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:26,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:48:26,899 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 13:48:27,599 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=7.03 vs. limit=12.0 2023-09-30 13:48:30,686 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:32,505 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=731213.3333333334, ans=0.125 2023-09-30 13:48:34,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 13:48:41,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:48:45,281 INFO [train.py:1039] (0/4) Epoch 21, batch 3450, loss[loss=0.159, simple_loss=0.2355, pruned_loss=0.04126, over 24574.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2514, pruned_loss=0.0498, over 4709583.77 frames. ], batch size: 60, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:48:45,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 13:48:50,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 13:48:50,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:51,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:48:51,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. 
Duration: 0.64 2023-09-30 13:48:53,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:55,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:49:01,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:49:01,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:03,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:49:03,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:05,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:11,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 13:49:18,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 13:49:20,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:49:20,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:49:21,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:49:26,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 13:49:26,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:49:29,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=731413.3333333334, ans=0.0 2023-09-30 13:49:32,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:49:32,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:49:33,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:49:36,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:49:38,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=731480.0, ans=0.125 2023-09-30 13:49:39,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 13:49:39,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:49:41,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:44,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 13:49:45,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 13:49:49,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:49:55,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:49:57,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:58,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:03,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:03,528 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:50:03,733 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=731546.6666666666, ans=0.0 2023-09-30 13:50:04,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:50:05,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:50:05,507 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=731613.3333333334, ans=0.125 2023-09-30 13:50:07,250 INFO [train.py:1039] (0/4) Epoch 21, batch 3500, loss[loss=0.1458, simple_loss=0.2233, pruned_loss=0.03413, over 24347.00 frames. 
], tot_loss[loss=0.1748, simple_loss=0.2508, pruned_loss=0.04939, over 4717280.73 frames. ], batch size: 56, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:50:10,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:13,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:50:13,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 13:50:15,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:50:18,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 13:50:21,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:21,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 13:50:28,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:50:29,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:50:29,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:50:29,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 13:50:29,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:50:31,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:31,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:31,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 13:50:34,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:35,920 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:50:37,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:41,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:43,334 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.924e+02 2.163e+02 2.586e+02 4.135e+02, threshold=4.325e+02, percent-clipped=1.0 2023-09-30 13:50:43,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 13:50:43,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:46,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:50:47,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:50:48,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:49,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:50:51,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:51,268 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 13:50:52,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 13:50:54,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 13:50:54,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:55,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:57,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:57,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:50:59,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. 
Duration: 0.5444375 2023-09-30 13:51:01,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:51:07,254 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:07,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 13:51:07,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 13:51:07,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:12,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:12,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:14,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:17,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 13:51:19,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:20,702 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:51:20,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. 
Duration: 0.96 2023-09-30 13:51:23,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 13:51:25,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:25,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:26,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:26,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:28,334 INFO [train.py:1039] (0/4) Epoch 21, batch 3550, loss[loss=0.1822, simple_loss=0.2474, pruned_loss=0.05851, over 23791.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2497, pruned_loss=0.04902, over 4718409.17 frames. ], batch size: 212, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:51:30,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:51:38,948 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=731946.6666666666, ans=0.125 2023-09-30 13:51:41,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:43,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:51:45,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 13:51:48,648 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:51:49,106 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=732013.3333333334, ans=0.0 2023-09-30 13:51:50,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:50,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:51:50,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:51:54,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:55,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:51:57,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:57,368 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:51:57,691 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=732013.3333333334, ans=0.125 2023-09-30 13:51:58,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:52:03,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 13:52:03,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:52:06,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:06,877 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:52:06,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:52:08,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 13:52:08,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:10,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:12,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:52:16,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:18,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:52:18,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:21,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. 
Duration: 0.94 2023-09-30 13:52:21,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:52:23,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 13:52:25,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:27,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:52:27,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:52:30,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 13:52:32,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:38,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:39,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 13:52:40,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:45,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 13:52:45,526 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=732213.3333333334, ans=0.0 2023-09-30 13:52:46,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 13:52:51,308 INFO [train.py:1039] (0/4) Epoch 21, batch 3600, loss[loss=0.1644, simple_loss=0.2363, pruned_loss=0.04626, over 23720.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2492, pruned_loss=0.04856, over 4728206.32 frames. ], batch size: 149, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:52:51,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 13:52:52,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:52:54,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:52:56,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:53:02,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:03,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:53:04,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:53:05,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:53:06,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:07,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 13:53:10,185 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:53:11,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:16,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:19,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:21,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:53:21,573 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:21,601 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 13:53:23,099 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 13:53:24,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:26,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:53:27,436 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.948e+02 2.290e+02 2.674e+02 4.312e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-30 13:53:27,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:31,318 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:32,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:53:32,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 13:53:40,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:53:41,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:53:43,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 13:53:46,812 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:53:48,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 13:53:54,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:58,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=732546.6666666666, ans=0.125 2023-09-30 13:53:59,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:01,445 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.10 vs. limit=15.0 2023-09-30 13:54:04,010 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=732546.6666666666, ans=0.125 2023-09-30 13:54:05,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:54:06,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:54:06,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 13:54:06,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 13:54:08,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 13:54:10,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 13:54:10,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:54:12,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 13:54:13,080 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:54:14,038 INFO [train.py:1039] (0/4) Epoch 21, batch 3650, loss[loss=0.1548, simple_loss=0.2411, pruned_loss=0.03421, over 24505.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2497, pruned_loss=0.04875, over 4720195.40 frames. ], batch size: 66, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:54:14,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:14,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:54:14,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:14,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 13:54:15,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 13:54:18,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:20,281 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. 
Duration: 0.69 2023-09-30 13:54:25,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 13:54:26,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:54:29,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 13:54:30,787 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=732680.0, ans=0.09899494936611666 2023-09-30 13:54:31,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 13:54:35,203 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=732680.0, ans=0.1 2023-09-30 13:54:36,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:54:36,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:54:36,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:54:38,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:54:38,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 13:54:38,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=732680.0, ans=0.125 2023-09-30 13:54:40,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 13:54:42,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:54:42,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:43,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 13:54:45,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:54:45,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:54:45,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:54:48,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:54:50,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 13:54:52,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 13:54:52,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:54:53,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 13:54:55,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:54:55,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:54:58,729 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=732746.6666666666, ans=0.125 2023-09-30 13:55:01,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:55:03,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:03,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:55:06,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:55:06,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:55:09,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:55:12,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:12,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 13:55:12,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:55:16,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:55:16,564 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:16,654 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:23,781 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 13:55:26,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:26,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:27,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:55:28,487 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:29,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:55:31,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:34,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-09-30 13:55:34,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:35,803 INFO [train.py:1039] (0/4) Epoch 21, batch 3700, loss[loss=0.1711, simple_loss=0.2492, pruned_loss=0.04648, over 23319.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.251, pruned_loss=0.04922, over 4715202.62 frames. ], batch size: 105, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:55:37,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:55:41,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:41,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:55:41,684 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=732946.6666666666, ans=0.125 2023-09-30 13:55:44,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:44,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 13:55:44,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:45,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:55:45,869 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 13:55:49,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:55:54,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:55,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:55:56,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:55:56,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:57,639 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:56:00,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:00,890 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 13:56:08,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:56:08,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:56:10,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:56:10,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. 
Duration: 0.96 2023-09-30 13:56:10,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:11,736 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.839e+02 1.995e+02 2.258e+02 3.801e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 13:56:15,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:17,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 13:56:18,630 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:18,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:56:22,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:22,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:56:25,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:56:29,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:29,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 13:56:29,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:56:29,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 13:56:36,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:56:37,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:56:39,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:39,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 13:56:42,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:56:42,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:56:43,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:43,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:47,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:47,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. 
Duration: 0.93 2023-09-30 13:56:47,446 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=733213.3333333334, ans=0.1 2023-09-30 13:56:48,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 13:56:50,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:56:50,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:56:51,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:56:52,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:56:52,625 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=733213.3333333334, ans=0.125 2023-09-30 13:56:57,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:58,209 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.17 vs. limit=15.0 2023-09-30 13:56:59,037 INFO [train.py:1039] (0/4) Epoch 21, batch 3750, loss[loss=0.1594, simple_loss=0.2343, pruned_loss=0.04222, over 24482.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2512, pruned_loss=0.0493, over 4726377.71 frames. ], batch size: 58, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:56:59,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 13:56:59,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:02,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 13:57:04,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 13:57:07,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:57:07,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 13:57:08,983 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:57:09,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:10,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:12,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:15,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 13:57:15,749 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=733346.6666666666, ans=0.125 2023-09-30 13:57:17,264 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=733346.6666666666, ans=0.125 2023-09-30 13:57:18,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:57:18,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:57:21,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:57:24,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:24,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 13:57:25,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:27,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:27,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:32,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. 
Duration: 0.55 2023-09-30 13:57:36,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 13:57:37,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:38,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:40,027 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.11 vs. limit=15.0 2023-09-30 13:57:40,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:42,646 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=733413.3333333334, ans=0.2 2023-09-30 13:57:44,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:46,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:57:52,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 13:57:55,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:59,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:59,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 13:58:03,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=733546.6666666666, ans=0.0 2023-09-30 13:58:04,307 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:58:08,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:58:09,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:58:11,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:58:11,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=733546.6666666666, ans=0.2 2023-09-30 13:58:13,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:58:17,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:58:17,790 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=733546.6666666666, ans=0.125 2023-09-30 13:58:20,407 INFO [train.py:1039] (0/4) Epoch 21, batch 3800, loss[loss=0.1591, simple_loss=0.2433, pruned_loss=0.03748, over 24481.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2516, pruned_loss=0.04988, over 4719475.90 frames. ], batch size: 63, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:58:25,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 13:58:28,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:28,620 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=733613.3333333334, ans=0.05 2023-09-30 13:58:29,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:58:31,339 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 13:58:31,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=733613.3333333334, ans=0.125 2023-09-30 13:58:32,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:36,830 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:36,958 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:58:40,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:58:40,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:40,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 13:58:42,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:43,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:58:43,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:43,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 13:58:47,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:58:48,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:58:51,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=733680.0, ans=22.5 2023-09-30 13:58:52,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=733746.6666666666, ans=0.125 2023-09-30 13:58:53,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:55,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:58:56,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. 
Duration: 0.611125 2023-09-30 13:58:58,104 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.186e+02 2.612e+02 3.955e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 13:58:58,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:58:58,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:59,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:01,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:59:04,739 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=733746.6666666666, ans=0.125 2023-09-30 13:59:06,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:59:06,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 13:59:07,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:14,078 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=733813.3333333334, ans=0.1 2023-09-30 13:59:15,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:59:21,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:59:22,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 13:59:24,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 13:59:25,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:59:28,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:30,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:31,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 13:59:34,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 13:59:34,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 13:59:34,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:35,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=733880.0, ans=0.0 2023-09-30 13:59:36,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 13:59:41,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:59:41,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:59:43,117 INFO [train.py:1039] (0/4) Epoch 21, batch 3850, loss[loss=0.1643, simple_loss=0.2175, pruned_loss=0.05559, over 19418.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2512, pruned_loss=0.04941, over 4701589.70 frames. ], batch size: 388, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:59:48,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:59:50,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 13:59:50,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:59:52,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:52,534 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=733946.6666666666, ans=0.125 2023-09-30 13:59:55,363 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:59:59,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:01,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. 
Duration: 0.52725 2023-09-30 14:00:01,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 14:00:09,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:10,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:00:15,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:15,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:00:18,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:19,081 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:00:21,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:21,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:00:23,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:24,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:26,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:00:26,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:00:26,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 14:00:26,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 14:00:27,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:27,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:28,654 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-09-30 14:00:30,605 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734080.0, ans=0.1 2023-09-30 14:00:31,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:31,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. 
Duration: 0.82 2023-09-30 14:00:32,082 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=734146.6666666666, ans=0.125 2023-09-30 14:00:34,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 14:00:36,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:39,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 14:00:40,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 14:00:46,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:47,018 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:50,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:52,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 14:00:54,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 14:00:59,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:59,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:01:03,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:01:03,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:01:03,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,750 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:01:04,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 14:01:04,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:01:06,251 INFO [train.py:1039] (0/4) Epoch 21, batch 3900, loss[loss=0.1759, simple_loss=0.2644, pruned_loss=0.04371, over 24552.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2502, pruned_loss=0.04907, over 4714321.22 frames. ], batch size: 71, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:01:07,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 14:01:07,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:07,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:01:09,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:01:09,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:10,149 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-09-30 14:01:11,092 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:01:11,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:11,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:01:12,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:12,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 14:01:12,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:14,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:15,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. 
Duration: 0.5454375 2023-09-30 14:01:16,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:01:17,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:22,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:22,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:23,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:01:26,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 14:01:27,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:28,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 14:01:28,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:31,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 14:01:31,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 14:01:38,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 14:01:39,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:39,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:01:41,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:01:41,402 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=734413.3333333334, ans=0.2 2023-09-30 14:01:44,191 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.861e+02 2.057e+02 2.419e+02 3.679e+02, threshold=4.115e+02, percent-clipped=0.0 2023-09-30 14:01:44,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:47,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:01:48,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:01:48,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:01:50,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:01:57,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:01:57,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:02:05,461 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734480.0, ans=0.1 2023-09-30 14:02:06,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:02:06,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:02:12,235 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=734546.6666666666, ans=0.125 2023-09-30 14:02:15,918 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=22.5 2023-09-30 14:02:17,980 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:18,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=734546.6666666666, ans=0.125 2023-09-30 14:02:19,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:19,911 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=734546.6666666666, ans=0.0 2023-09-30 14:02:21,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. 
Duration: 0.72 2023-09-30 14:02:21,322 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 14:02:22,695 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:22,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 14:02:24,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:02:25,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 14:02:29,014 INFO [train.py:1039] (0/4) Epoch 21, batch 3950, loss[loss=0.1927, simple_loss=0.2591, pruned_loss=0.06314, over 24001.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2497, pruned_loss=0.04923, over 4706879.10 frames. ], batch size: 196, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:02:32,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:02:34,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 14:02:34,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:02:38,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:02:41,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 14:02:45,359 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 14:02:46,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:46,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 14:02:48,822 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 14:02:48,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:51,023 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.01 vs. limit=22.5 2023-09-30 14:02:51,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:51,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:02:51,944 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:55,073 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 14:02:55,423 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=734680.0, ans=0.2 2023-09-30 14:02:56,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:02:56,961 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=734680.0, ans=0.125 2023-09-30 14:02:58,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:58,130 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:02:58,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:02:58,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=734680.0, ans=0.05 2023-09-30 14:02:59,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:03:10,169 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=734746.6666666666, ans=0.04949747468305833 2023-09-30 14:03:11,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:03:11,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:03:16,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 14:03:24,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 14:03:24,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. 
Duration: 0.85 2023-09-30 14:03:24,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:03:25,179 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=734813.3333333334, ans=0.125 2023-09-30 14:03:25,520 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.87 vs. limit=10.0 2023-09-30 14:03:27,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:03:35,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:03:35,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:03:37,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:03:37,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:03:37,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 14:03:40,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:03:42,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 14:03:47,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 14:03:52,515 INFO [train.py:1039] (0/4) Epoch 21, batch 4000, loss[loss=0.1703, simple_loss=0.2516, pruned_loss=0.04449, over 24674.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2505, pruned_loss=0.04939, over 4713571.21 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 14:03:59,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:00,002 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.52 vs. limit=10.0 2023-09-30 14:04:01,078 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=734946.6666666666, ans=0.1 2023-09-30 14:04:02,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=734946.6666666666, ans=0.125 2023-09-30 14:04:04,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734946.6666666666, ans=0.1 2023-09-30 14:04:05,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:12,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:12,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:04:13,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:13,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 14:04:15,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:04:15,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 14:04:15,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:04:15,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 14:04:18,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:21,370 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:04:21,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:04:21,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:04:21,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:21,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 14:04:21,706 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=735013.3333333334, ans=0.0 2023-09-30 14:04:24,811 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:04:26,272 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 14:04:28,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:04:28,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:29,851 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.867e+02 2.046e+02 2.259e+02 3.289e+02, threshold=4.093e+02, percent-clipped=0.0 2023-09-30 14:04:31,601 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 14:04:33,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:04:33,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:38,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 14:04:40,030 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:41,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 14:04:43,170 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 14:04:44,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:04:44,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 14:04:44,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:04:46,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:48,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:04:49,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:04:49,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:04:49,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:52,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 14:04:52,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:55,996 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. 
Duration: 0.896 2023-09-30 14:04:59,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:05:03,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 14:05:06,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:05:07,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:08,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:05:09,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:10,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=735213.3333333334, ans=22.5 2023-09-30 14:05:14,495 INFO [train.py:1039] (0/4) Epoch 21, batch 4050, loss[loss=0.1622, simple_loss=0.2483, pruned_loss=0.03804, over 24496.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2511, pruned_loss=0.04975, over 4707236.26 frames. ], batch size: 66, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:05:16,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:17,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:05:19,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. 
Duration: 0.72 2023-09-30 14:05:20,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:05:22,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:05:22,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:05:24,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:26,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:26,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=735280.0, ans=0.2 2023-09-30 14:05:29,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:32,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:05:33,683 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 14:05:35,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:05:36,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 14:05:38,379 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=735346.6666666666, ans=0.0 2023-09-30 14:05:41,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:43,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:46,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 14:05:47,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 14:05:48,015 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 14:05:49,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:05:57,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 14:05:57,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:00,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:02,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=735480.0, ans=0.125 2023-09-30 14:06:03,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:06:05,448 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:06:05,484 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:06,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.67 vs. limit=12.0 2023-09-30 14:06:08,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:06:14,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 14:06:14,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:06:16,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:17,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 14:06:21,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:28,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 14:06:30,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:30,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 14:06:31,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 14:06:31,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 14:06:31,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:35,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:06:36,982 INFO [train.py:1039] (0/4) Epoch 21, batch 4100, loss[loss=0.1546, simple_loss=0.2403, pruned_loss=0.0345, over 24465.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.252, pruned_loss=0.05012, over 4714123.88 frames. ], batch size: 63, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:06:37,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:37,120 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:06:44,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 14:06:47,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 14:06:48,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 14:06:49,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-09-30 14:06:49,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:51,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:51,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:52,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:06:53,073 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 14:06:54,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=735680.0, ans=0.0 2023-09-30 14:06:57,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:06:57,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:06:57,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:57,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:07:01,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:07:02,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 14:07:04,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:07:04,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 14:07:04,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:04,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:07:06,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:06,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:07:07,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 14:07:09,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:10,399 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.17 vs. limit=15.0 2023-09-30 14:07:11,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 14:07:12,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:07:16,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 14:07:16,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 14:07:18,229 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.830e+02 1.986e+02 2.444e+02 3.912e+02, threshold=3.973e+02, percent-clipped=0.0 2023-09-30 14:07:19,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:07:19,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:07:21,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:07:22,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 14:07:24,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:07:26,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:07:29,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 14:07:29,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:29,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:07:34,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 14:07:40,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:07:42,270 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=735880.0, ans=0.125 2023-09-30 14:07:43,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:45,412 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:07:47,877 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.47 vs. limit=15.0 2023-09-30 14:07:52,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:07:52,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:55,065 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:07:56,108 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:57,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:08:00,538 INFO [train.py:1039] (0/4) Epoch 21, batch 4150, loss[loss=0.1632, simple_loss=0.2271, pruned_loss=0.04961, over 22727.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2519, pruned_loss=0.05002, over 4713924.38 frames. 
], batch size: 322, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:08:02,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:08:04,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:08:05,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:08:05,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:06,075 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:08:08,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 14:08:08,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:10,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 14:08:10,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 14:08:12,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 14:08:12,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:17,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 14:08:17,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:22,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:22,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:08:23,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:08:25,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:08:25,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:27,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:08:29,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=736013.3333333334, ans=0.0 2023-09-30 14:08:31,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:35,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:38,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. 
Duration: 0.99 2023-09-30 14:08:38,804 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=736080.0, ans=0.1 2023-09-30 14:08:40,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 14:08:40,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:08:42,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 14:08:42,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:08:42,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:08:44,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:08:46,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:51,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 14:08:54,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:08:55,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:08:57,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. 
Duration: 0.76 2023-09-30 14:08:57,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:59,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 14:09:02,073 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.25 vs. limit=22.5 2023-09-30 14:09:02,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:09:02,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:09:04,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:05,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 14:09:05,771 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:05,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:09:08,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:09:10,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 14:09:10,488 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:09:10,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:09:10,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:09:12,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 14:09:12,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:09:12,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:09:14,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:09:14,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:14,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 14:09:15,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:09:20,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:09:20,672 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=736280.0, ans=0.125 2023-09-30 14:09:22,187 INFO [train.py:1039] (0/4) Epoch 21, batch 4200, loss[loss=0.173, simple_loss=0.2146, pruned_loss=0.06573, over 19716.00 frames. 
], tot_loss[loss=0.1757, simple_loss=0.2509, pruned_loss=0.05026, over 4698973.47 frames. ], batch size: 389, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:09:22,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 14:09:23,989 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:09:27,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:28,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:09:28,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:28,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:32,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 14:09:35,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 14:09:36,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=736280.0, ans=0.125 2023-09-30 14:09:37,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:38,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. 
Duration: 0.8454375 2023-09-30 14:09:41,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:09:45,497 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=736346.6666666666, ans=0.1 2023-09-30 14:09:46,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:09:46,966 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:09:47,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:48,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 14:09:48,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:50,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:51,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:51,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:09:53,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 14:09:54,062 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.48 vs. limit=15.0 2023-09-30 14:09:54,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 14:09:54,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:10:00,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:10:00,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:10:01,435 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.766e+02 1.995e+02 2.283e+02 3.415e+02, threshold=3.990e+02, percent-clipped=0.0 2023-09-30 14:10:01,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:10:04,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:10:06,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:10:06,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 14:10:07,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:10:09,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:10:12,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:10:14,177 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:22,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:10:23,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 14:10:26,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:33,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:10:33,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:34,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 14:10:41,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:10:42,545 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=15.13 vs. limit=15.0 2023-09-30 14:10:45,460 INFO [train.py:1039] (0/4) Epoch 21, batch 4250, loss[loss=0.1762, simple_loss=0.2508, pruned_loss=0.05076, over 18958.00 frames. 
], tot_loss[loss=0.175, simple_loss=0.2495, pruned_loss=0.05025, over 4688986.86 frames. ], batch size: 41, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:10:45,867 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=736613.3333333334, ans=0.0 2023-09-30 14:10:45,915 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=736613.3333333334, ans=0.125 2023-09-30 14:10:47,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:47,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:10:49,124 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=12.02 vs. limit=22.5 2023-09-30 14:10:52,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:56,607 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:10:56,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 14:10:56,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:11:01,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:11:02,876 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=736680.0, ans=0.125 2023-09-30 14:11:06,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:08,662 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.49 vs. limit=22.5 2023-09-30 14:11:09,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:09,395 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:12,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:11:12,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:13,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:16,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:16,368 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:11:17,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:19,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. 
Duration: 0.763625 2023-09-30 14:11:19,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:21,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 14:11:25,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 14:11:25,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:25,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:25,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=736746.6666666666, ans=0.125 2023-09-30 14:11:26,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:26,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:11:26,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:27,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:31,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:11:32,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 14:11:37,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:11:39,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:40,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 14:11:41,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:11:41,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 14:11:42,631 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:11:44,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:11:45,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:45,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:46,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=736813.3333333334, ans=0.04949747468305833 2023-09-30 14:11:49,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 14:11:50,284 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.40 vs. 
limit=12.0 2023-09-30 14:11:51,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:11:52,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:11:56,511 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=736880.0, ans=0.2 2023-09-30 14:11:57,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:12:00,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:02,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:12:02,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:03,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:05,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:12:06,811 INFO [train.py:1039] (0/4) Epoch 21, batch 4300, loss[loss=0.1812, simple_loss=0.2483, pruned_loss=0.05708, over 22756.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2491, pruned_loss=0.04943, over 4705092.41 frames. ], batch size: 323, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:12:06,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:12:06,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 14:12:09,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:13,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:15,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:19,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:25,639 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.86 vs. limit=15.0 2023-09-30 14:12:29,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:29,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 14:12:29,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:12:33,465 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:12:33,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:12:33,537 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. 
Duration: 0.896 2023-09-30 14:12:35,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=737013.3333333334, ans=0.125 2023-09-30 14:12:36,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:12:38,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:12:41,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 14:12:41,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:12:42,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 14:12:44,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:12:45,600 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.887e+02 2.170e+02 2.542e+02 3.657e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 14:12:45,828 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:12:48,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=737080.0, ans=0.125 2023-09-30 14:12:48,290 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=737080.0, ans=0.0 2023-09-30 14:12:49,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 14:12:49,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:49,682 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=737080.0, ans=0.125 2023-09-30 14:12:51,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:12:52,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:12:52,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:52,903 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=737080.0, ans=0.125 2023-09-30 14:12:54,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 14:12:54,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 14:12:56,147 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=737146.6666666666, ans=0.0 2023-09-30 14:12:57,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:13:01,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-30 14:13:01,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:13:01,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 14:13:02,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 14:13:02,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 14:13:04,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:04,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 14:13:06,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 14:13:09,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:11,241 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 14:13:11,326 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:13:14,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:13:14,433 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:15,985 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 14:13:16,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=737213.3333333334, ans=0.125 2023-09-30 14:13:17,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:13:17,464 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:19,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:13:19,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:20,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:13:22,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:13:25,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:25,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:25,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 14:13:28,545 INFO [train.py:1039] (0/4) Epoch 21, batch 4350, loss[loss=0.1893, simple_loss=0.2596, pruned_loss=0.0595, over 23764.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2502, pruned_loss=0.04979, over 4713913.12 frames. ], batch size: 164, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:13:31,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 14:13:31,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:13:37,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:41,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:44,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:13:44,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:13:49,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:13:54,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:56,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:13:57,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:14:00,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:14:02,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:14:02,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:14:07,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 14:14:07,413 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=737413.3333333334, ans=0.125 2023-09-30 14:14:08,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:08,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:11,552 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=737413.3333333334, ans=0.2 2023-09-30 14:14:15,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:18,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 14:14:22,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:23,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. 
Duration: 0.6555625 2023-09-30 14:14:28,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 14:14:31,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:31,811 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:14:33,237 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 14:14:33,339 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 14:14:33,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:34,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:35,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:14:35,927 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.56 vs. limit=15.0 2023-09-30 14:14:36,497 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:38,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:38,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:14:38,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=737546.6666666666, ans=0.1 2023-09-30 14:14:41,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 14:14:41,234 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:41,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:41,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:41,850 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.28 vs. limit=15.0 2023-09-30 14:14:42,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 14:14:44,320 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 14:14:44,327 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 14:14:44,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 14:14:47,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:14:48,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 14:14:48,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:14:48,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:14:52,187 INFO [train.py:1039] (0/4) Epoch 21, batch 4400, loss[loss=0.1816, simple_loss=0.253, pruned_loss=0.05514, over 23810.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2511, pruned_loss=0.0497, over 4711180.59 frames. ], batch size: 212, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:14:52,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 14:14:53,719 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 14:14:53,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:57,860 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.26 vs. limit=15.0 2023-09-30 14:14:58,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:14:58,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:00,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:15:01,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. 
Duration: 0.88 2023-09-30 14:15:01,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 14:15:03,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 14:15:03,313 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 14:15:03,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:15:05,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:15:07,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 14:15:08,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:10,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:10,214 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 14:15:14,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:14,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 14:15:14,833 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 14:15:18,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. 
Duration: 0.68 2023-09-30 14:15:19,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 14:15:19,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 14:15:19,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:20,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:20,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=737680.0, ans=0.0 2023-09-30 14:15:21,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:22,403 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.68 vs. limit=22.5 2023-09-30 14:15:23,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 14:15:23,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 14:15:25,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:27,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 14:15:27,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:27,564 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=737746.6666666666, ans=0.07 2023-09-30 14:15:30,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:30,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:30,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 14:15:30,504 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:15:31,465 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.966e+02 2.237e+02 2.534e+02 3.532e+02, threshold=4.474e+02, percent-clipped=0.0 2023-09-30 14:15:31,650 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 14:15:32,098 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=737746.6666666666, ans=0.1 2023-09-30 14:15:32,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=737746.6666666666, ans=0.125 2023-09-30 14:15:33,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=737746.6666666666, ans=0.2 2023-09-30 14:15:34,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:15:43,117 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:44,762 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 14:15:49,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:15:51,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:15:52,409 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.00 vs. limit=15.0 2023-09-30 14:15:56,260 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:15:56,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 14:15:58,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:15:58,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:15:58,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:15:58,559 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:16:02,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. 
Duration: 0.84 2023-09-30 14:16:05,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 14:16:08,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 14:16:08,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:08,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 14:16:08,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:16:08,746 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=737880.0, ans=0.1 2023-09-30 14:16:08,754 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=737880.0, ans=0.0 2023-09-30 14:16:11,709 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:16:13,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 14:16:15,305 INFO [train.py:1039] (0/4) Epoch 21, batch 4450, loss[loss=0.1887, simple_loss=0.2754, pruned_loss=0.05101, over 24410.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2524, pruned_loss=0.05028, over 4705639.11 frames. 
], batch size: 69, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:16:15,744 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=737946.6666666666, ans=0.125 2023-09-30 14:16:17,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:16:20,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:20,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:16:25,151 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:16:25,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:16:30,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:33,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:16:34,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:16:34,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:37,625 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.17 vs. 
limit=15.0 2023-09-30 14:16:38,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 14:16:38,458 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:38,764 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:16:39,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:40,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:16:40,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:16:41,707 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:16:47,459 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.08 vs. limit=15.0 2023-09-30 14:16:48,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:48,411 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:49,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:51,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:16:53,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:16:57,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:16:58,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 14:17:00,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 14:17:00,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:17:02,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:03,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 14:17:07,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:17:10,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:12,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 14:17:14,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:14,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 14:17:14,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:17:14,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:15,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:20,969 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:17:21,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 14:17:22,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:17:24,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:17:25,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:27,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:27,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:17:30,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 14:17:31,269 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.64 vs. limit=10.0 2023-09-30 14:17:32,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 14:17:33,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:17:37,152 INFO [train.py:1039] (0/4) Epoch 21, batch 4500, loss[loss=0.1836, simple_loss=0.2715, pruned_loss=0.04784, over 24548.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2534, pruned_loss=0.05069, over 4712300.02 frames. ], batch size: 71, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:17:39,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:41,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 14:17:41,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 14:17:43,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:17:48,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:50,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:52,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 14:17:52,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:17:53,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:17:54,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:18:05,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:18:06,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:18:09,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:09,818 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=738413.3333333334, ans=0.125 2023-09-30 14:18:10,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:18:11,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:18:16,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. 
Duration: 0.5090625 2023-09-30 14:18:18,455 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.886e+02 2.149e+02 2.495e+02 4.486e+02, threshold=4.299e+02, percent-clipped=1.0 2023-09-30 14:18:20,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:18:26,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:18:28,956 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:18:30,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 14:18:30,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=738480.0, ans=0.2 2023-09-30 14:18:31,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:32,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,912 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:35,386 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:18:36,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:18:36,645 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 14:18:36,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:18:36,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:41,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:18:41,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:18:45,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:47,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:18:47,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:18:47,633 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=738546.6666666666, ans=0.125 2023-09-30 14:18:48,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 14:18:51,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 14:18:52,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. 
Duration: 0.8 2023-09-30 14:18:55,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 14:18:59,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 14:18:59,813 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=738613.3333333334, ans=0.5 2023-09-30 14:18:59,857 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=738613.3333333334, ans=0.2 2023-09-30 14:19:00,953 INFO [train.py:1039] (0/4) Epoch 21, batch 4550, loss[loss=0.1844, simple_loss=0.2668, pruned_loss=0.05105, over 24360.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2522, pruned_loss=0.05039, over 4710026.40 frames. ], batch size: 77, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:19:01,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:03,596 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.03 vs. limit=15.0 2023-09-30 14:19:05,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:05,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:06,569 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.47 vs. limit=15.0 2023-09-30 14:19:08,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:19:09,092 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738613.3333333334, ans=0.1 2023-09-30 14:19:13,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:19:15,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:19:16,930 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:16,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:19:16,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:21,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:21,991 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:25,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:19:28,200 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 14:19:28,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 14:19:29,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 14:19:31,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 14:19:37,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 14:19:37,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:37,470 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=738746.6666666666, ans=0.125 2023-09-30 14:19:38,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 14:19:40,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:19:43,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:19:46,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 14:19:48,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:19:49,760 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 14:19:51,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:53,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:56,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 14:19:57,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 14:19:57,048 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:19:57,422 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:19:58,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 14:20:00,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 14:20:01,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:20:01,604 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:01,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:20:02,049 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738813.3333333334, ans=0.1 2023-09-30 14:20:03,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:03,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:20:05,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:20:06,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 14:20:08,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:20:08,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:20:08,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 14:20:08,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:20:09,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 14:20:13,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:20:13,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:20:13,335 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=738880.0, ans=0.125 2023-09-30 14:20:14,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=738880.0, ans=0.125 2023-09-30 14:20:16,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:20:16,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=738880.0, ans=0.0 2023-09-30 14:20:17,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:17,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:20:19,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:20:20,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:20:22,182 INFO [train.py:1039] (0/4) Epoch 21, batch 4600, loss[loss=0.1658, simple_loss=0.2397, pruned_loss=0.04599, over 23370.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.04983, over 4713133.91 frames. ], batch size: 119, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:20:23,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:23,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:20:26,559 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.55 vs. limit=6.0 2023-09-30 14:20:27,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:20:27,490 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:20:29,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:31,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 14:20:34,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:20:34,440 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=738946.6666666666, ans=0.0 2023-09-30 14:20:37,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:20:37,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:40,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:47,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 14:20:48,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:20:50,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:55,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:20:55,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:00,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 14:21:00,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:21:02,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:03,435 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.897e+02 2.077e+02 2.452e+02 3.334e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 14:21:08,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:08,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:21:10,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:21:17,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 14:21:18,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 14:21:21,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:23,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:21:26,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:26,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 14:21:26,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:28,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 14:21:28,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:28,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:29,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:31,062 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:31,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 14:21:31,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=739213.3333333334, ans=0.125 2023-09-30 14:21:32,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 14:21:32,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 14:21:34,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 14:21:34,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:34,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:36,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:37,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:38,232 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:21:38,843 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.32 vs. limit=15.0 2023-09-30 14:21:42,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=739213.3333333334, ans=0.125 2023-09-30 14:21:44,497 INFO [train.py:1039] (0/4) Epoch 21, batch 4650, loss[loss=0.1569, simple_loss=0.2382, pruned_loss=0.03776, over 24320.00 frames. 
], tot_loss[loss=0.1739, simple_loss=0.2498, pruned_loss=0.04898, over 4713480.12 frames. ], batch size: 61, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:21:48,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:21:51,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:51,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:21:52,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:52,867 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:53,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:57,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 14:22:00,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:22:02,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 14:22:02,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 14:22:03,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 14:22:03,661 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:22:05,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 14:22:05,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 14:22:05,194 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:05,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:22:10,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:22:11,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:11,984 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 14:22:12,681 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.67 vs. limit=22.5 2023-09-30 14:22:14,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:15,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. 
Duration: 0.97 2023-09-30 14:22:18,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:18,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:22:18,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 14:22:22,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:22:25,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:22:28,889 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:35,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:35,258 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=739480.0, ans=0.125 2023-09-30 14:22:36,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=739480.0, ans=0.07 2023-09-30 14:22:38,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:39,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:41,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 14:22:44,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 14:22:44,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 14:22:46,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 14:22:46,344 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 14:22:47,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:22:54,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:22:54,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:22:56,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 14:22:56,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:58,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:22:58,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:22:59,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 14:23:03,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:23:03,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:23:05,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:23:06,498 INFO [train.py:1039] (0/4) Epoch 21, batch 4700, loss[loss=0.1886, simple_loss=0.2637, pruned_loss=0.05671, over 23769.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2508, pruned_loss=0.04911, over 4723461.91 frames. ], batch size: 212, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:23:08,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:09,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:23:09,666 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:23:09,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 14:23:11,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:23:12,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 14:23:19,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:23:21,118 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:21,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:23:22,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:22,938 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:23:24,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:23:30,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 14:23:30,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 14:23:35,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:36,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:23:36,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:23:38,350 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=739746.6666666666, ans=0.1 2023-09-30 14:23:39,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:23:46,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:23:46,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:23:47,473 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.021e+02 2.237e+02 3.621e+02, threshold=4.043e+02, percent-clipped=0.0 2023-09-30 14:23:48,054 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=739746.6666666666, ans=0.125 2023-09-30 14:23:48,063 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=739746.6666666666, ans=0.0 2023-09-30 14:23:49,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:55,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 14:23:56,155 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:23:59,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:04,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 14:24:05,937 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:24:08,469 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.61 vs. limit=15.0 2023-09-30 14:24:11,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:24:11,286 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=739880.0, ans=0.0 2023-09-30 14:24:12,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 14:24:14,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:14,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:15,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:24:17,304 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:24:17,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 14:24:17,690 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=739880.0, ans=0.125 2023-09-30 14:24:18,867 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. 
Duration: 0.768 2023-09-30 14:24:19,199 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=739880.0, ans=0.0 2023-09-30 14:24:20,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:21,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 14:24:23,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:28,520 INFO [train.py:1039] (0/4) Epoch 21, batch 4750, loss[loss=0.1678, simple_loss=0.2431, pruned_loss=0.04624, over 23654.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2515, pruned_loss=0.04966, over 4715732.98 frames. ], batch size: 149, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:24:28,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 14:24:30,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:24:31,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:24:35,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:24:37,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 14:24:37,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:24:42,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 14:24:45,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:24:45,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:47,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:24:52,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 14:24:58,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:24:59,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 14:24:59,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:05,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:25:05,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:05,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:06,509 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 14:25:06,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 14:25:11,889 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.91 vs. limit=15.0 2023-09-30 14:25:12,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 14:25:14,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:18,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:20,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:25:20,644 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 14:25:20,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:23,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 14:25:25,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:25:26,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 14:25:26,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 14:25:28,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:28,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:25:29,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:29,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:25:29,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 14:25:33,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 14:25:36,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:25:38,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:25:38,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=740213.3333333334, ans=0.125 2023-09-30 14:25:39,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 14:25:40,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:40,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:41,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:25:43,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:44,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:25:47,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:47,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 14:25:48,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 14:25:49,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 14:25:52,001 INFO [train.py:1039] (0/4) Epoch 21, batch 4800, loss[loss=0.185, simple_loss=0.2507, pruned_loss=0.05964, over 23773.00 frames. 
], tot_loss[loss=0.1766, simple_loss=0.2527, pruned_loss=0.05028, over 4715234.16 frames. ], batch size: 164, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:25:53,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:25:53,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:55,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 14:25:55,724 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.89 vs. limit=6.0 2023-09-30 14:26:00,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:01,529 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:06,449 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:26:07,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:08,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:08,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. 
Duration: 0.86 2023-09-30 14:26:10,223 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=740346.6666666666, ans=0.125 2023-09-30 14:26:11,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:26:11,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:26:11,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:26:15,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:15,353 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=740346.6666666666, ans=0.125 2023-09-30 14:26:16,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:17,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:26:17,607 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=740346.6666666666, ans=0.2 2023-09-30 14:26:18,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:18,932 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:26:18,963 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:26:20,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:24,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:25,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=740413.3333333334, ans=0.125 2023-09-30 14:26:27,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:26:30,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:26:31,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:32,878 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.899e+02 2.131e+02 2.497e+02 3.417e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-30 14:26:34,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 14:26:34,511 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 14:26:35,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:26:36,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:26:36,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:26:36,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:36,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:26:37,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:26:38,093 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=740413.3333333334, ans=0.125 2023-09-30 14:26:39,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:43,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:26:47,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:48,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:26:49,376 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.26 vs. 
limit=12.0 2023-09-30 14:26:55,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 14:26:55,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:55,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:55,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:26:57,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:57,488 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=740546.6666666666, ans=0.0 2023-09-30 14:27:00,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:27:02,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:27:02,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:02,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:27:03,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:27:05,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 14:27:08,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:08,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:08,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:27:10,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 14:27:12,838 INFO [train.py:1039] (0/4) Epoch 21, batch 4850, loss[loss=0.1607, simple_loss=0.2416, pruned_loss=0.03993, over 24666.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2527, pruned_loss=0.0509, over 4702948.13 frames. ], batch size: 65, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:27:12,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 14:27:12,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:12,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:13,108 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:13,110 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:16,764 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:27:20,053 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=740613.3333333334, ans=0.2 2023-09-30 14:27:26,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 14:27:27,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:29,393 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=740680.0, ans=0.125 2023-09-30 14:27:32,942 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:33,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:27:34,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:38,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:39,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:27:41,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:27:41,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 14:27:46,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:27:47,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:27:47,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:27:49,269 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:27:49,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 14:27:52,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:52,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:54,794 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=740746.6666666666, ans=0.0 2023-09-30 14:27:56,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 14:27:56,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 14:27:57,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:28:06,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 14:28:07,523 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 14:28:07,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:28:07,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:28:09,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:28:10,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=740813.3333333334, ans=0.0 2023-09-30 14:28:13,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 14:28:13,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:14,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 14:28:14,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:16,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:16,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. 
Duration: 0.73 2023-09-30 14:28:22,780 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=740880.0, ans=0.0 2023-09-30 14:28:25,659 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=740880.0, ans=0.0 2023-09-30 14:28:27,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:32,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:28:32,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:35,181 INFO [train.py:1039] (0/4) Epoch 21, batch 4900, loss[loss=0.1577, simple_loss=0.242, pruned_loss=0.03668, over 24689.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.252, pruned_loss=0.05056, over 4698185.71 frames. ], batch size: 65, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:28:37,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 14:28:37,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:28:42,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:44,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:44,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. 
Duration: 0.8 2023-09-30 14:28:47,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 14:28:52,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 14:28:52,536 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=741013.3333333334, ans=0.125 2023-09-30 14:28:56,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 14:28:57,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 14:28:57,800 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:28:57,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:57,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:28:57,902 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:57,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:28:59,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 14:29:00,115 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.58 vs. 
limit=15.0 2023-09-30 14:29:03,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 14:29:05,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:29:06,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:29:08,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:29:10,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:29:12,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:12,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:12,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 14:29:15,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:29:16,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:29:16,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 14:29:16,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. 
Duration: 0.8 2023-09-30 14:29:17,286 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.865e+02 2.105e+02 2.513e+02 4.105e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 14:29:20,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 14:29:22,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:29:25,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:29:25,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:29:26,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=741146.6666666666, ans=0.0 2023-09-30 14:29:27,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:27,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:29:27,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:29:27,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 14:29:30,718 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:32,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 14:29:33,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:29:37,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 14:29:38,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:29:38,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 14:29:40,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 14:29:48,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:29:50,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:29:51,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 14:29:52,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:29:52,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:29:52,278 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=741213.3333333334, ans=0.2 2023-09-30 14:29:52,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.71 vs. 
limit=15.0 2023-09-30 14:29:53,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:58,092 INFO [train.py:1039] (0/4) Epoch 21, batch 4950, loss[loss=0.1572, simple_loss=0.2401, pruned_loss=0.03717, over 24305.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.25, pruned_loss=0.05008, over 4687375.83 frames. ], batch size: 61, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:29:58,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:29:58,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:30:00,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:30:00,171 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 14:30:01,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:30:05,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:05,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:30:08,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 14:30:08,599 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. 
Duration: 0.8 2023-09-30 14:30:08,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:30:10,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 14:30:10,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:10,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:30:11,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:30:12,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:13,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:15,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:30:16,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:30:18,744 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:20,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:20,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:30:23,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:30:28,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:30,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:30:31,689 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:33,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:35,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:30:35,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 14:30:35,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 14:30:39,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:42,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:30:42,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:30:43,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 14:30:43,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:30:45,248 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:30:48,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:49,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:30:50,072 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=741480.0, ans=0.125 2023-09-30 14:30:51,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:30:53,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:55,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:55,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 14:30:55,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:30:57,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:31:00,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 14:31:03,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:31:03,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:31:03,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:03,728 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:31:05,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:31:06,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:31:08,845 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:31:08,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:31:10,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 14:31:16,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:19,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 14:31:19,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. 
Duration: 0.52725 2023-09-30 14:31:22,315 INFO [train.py:1039] (0/4) Epoch 21, batch 5000, loss[loss=0.1731, simple_loss=0.2611, pruned_loss=0.0426, over 24515.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.249, pruned_loss=0.04929, over 4691259.95 frames. ], batch size: 71, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:31:27,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:27,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:29,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 14:31:31,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 14:31:32,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:31:36,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 14:31:36,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:31:37,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:31:37,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 14:31:39,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 14:31:39,115 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:31:40,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 14:31:40,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:40,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:31:43,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 14:31:45,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 14:31:45,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:31:46,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 14:31:46,023 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:31:47,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:49,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:31:49,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. 
Duration: 0.99 2023-09-30 14:31:49,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 14:31:52,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 14:31:52,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:54,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:54,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 14:31:54,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:55,875 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:57,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:58,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:32:00,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 14:32:01,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:32:01,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 14:32:03,265 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.857e+02 2.049e+02 2.517e+02 4.196e+02, threshold=4.099e+02, percent-clipped=0.0 2023-09-30 14:32:06,977 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 14:32:10,139 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:32:11,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:32:11,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:14,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 14:32:16,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:32:16,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:16,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:32:17,826 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=741813.3333333334, ans=0.2 2023-09-30 14:32:19,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 14:32:19,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 14:32:21,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:23,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:32:28,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=741880.0, ans=0.05 2023-09-30 14:32:29,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 14:32:32,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:43,721 INFO [train.py:1039] (0/4) Epoch 21, batch 5050, loss[loss=0.1934, simple_loss=0.2612, pruned_loss=0.0628, over 23920.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2497, pruned_loss=0.04948, over 4699975.76 frames. ], batch size: 195, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:32:43,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:44,241 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=741946.6666666666, ans=0.2 2023-09-30 14:32:44,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=741946.6666666666, ans=0.5 2023-09-30 14:32:45,491 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:45,502 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 14:32:45,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:45,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:32:47,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:32:47,094 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 14:32:53,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:32:57,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:59,043 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:32:59,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 14:33:00,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:02,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:33:03,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:33:03,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:33:05,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:33:05,569 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742013.3333333334, ans=0.1 2023-09-30 14:33:15,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 14:33:16,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:33:17,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:18,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 14:33:20,040 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:20,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:20,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:20,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 14:33:20,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 14:33:21,790 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 14:33:23,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:26,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:29,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:29,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 14:33:31,779 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:33:33,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 14:33:37,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:33:37,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:33:37,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:33:37,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:39,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:33:42,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:33:42,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:42,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:33:42,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:33:42,725 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=742146.6666666666, ans=0.0 2023-09-30 14:33:43,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 14:33:45,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:33:47,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 14:33:51,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742213.3333333334, ans=0.1 2023-09-30 14:33:52,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:52,647 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 14:33:52,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:33:54,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:33:54,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:54,266 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 14:33:56,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:56,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 14:33:56,034 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:00,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:01,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:34:01,998 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 14:34:04,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 14:34:05,790 INFO [train.py:1039] (0/4) Epoch 21, batch 5100, loss[loss=0.1766, simple_loss=0.2512, pruned_loss=0.05102, over 23174.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2501, pruned_loss=0.04954, over 4702592.21 frames. ], batch size: 105, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:34:06,146 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:34:07,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:07,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:07,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:34:10,597 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 14:34:13,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:34:14,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=742280.0, ans=0.0 2023-09-30 14:34:15,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. 
Duration: 0.85 2023-09-30 14:34:15,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 14:34:15,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:19,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:34:22,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:34:22,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 14:34:24,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 14:34:27,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:28,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:34:31,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:32,040 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=742346.6666666666, ans=0.125 2023-09-30 14:34:33,665 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=742346.6666666666, ans=0.0 2023-09-30 14:34:35,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. 
Duration: 0.94 2023-09-30 14:34:36,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:38,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:38,646 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:34:40,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:43,585 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:43,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 14:34:43,863 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=742413.3333333334, ans=0.125 2023-09-30 14:34:45,217 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 14:34:45,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:46,539 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.007e+02 2.200e+02 2.519e+02 3.504e+02, threshold=4.400e+02, percent-clipped=0.0 2023-09-30 14:34:46,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 14:34:46,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. 
Duration: 0.97 2023-09-30 14:34:49,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:35:01,622 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:04,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 14:35:04,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 14:35:04,717 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 14:35:06,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 14:35:06,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:35:07,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 14:35:13,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 14:35:15,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:35:16,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:35:19,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. 
Duration: 0.87 2023-09-30 14:35:21,455 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:35:22,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 14:35:28,123 INFO [train.py:1039] (0/4) Epoch 21, batch 5150, loss[loss=0.1736, simple_loss=0.2601, pruned_loss=0.04358, over 24553.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2513, pruned_loss=0.04986, over 4706491.90 frames. ], batch size: 71, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:35:28,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:35:28,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:35:28,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:35:29,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:35:31,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:35:32,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:35:33,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 14:35:33,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. 
Duration: 0.73 2023-09-30 14:35:34,039 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 14:35:34,074 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:35:35,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 14:35:36,296 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:36,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:35:38,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:40,046 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:44,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:35:44,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 14:35:46,828 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742680.0, ans=0.1 2023-09-30 14:35:47,279 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=742680.0, ans=22.5 2023-09-30 14:35:47,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 14:35:48,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:35:50,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=742680.0, ans=0.125 2023-09-30 14:35:51,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:35:51,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:35:51,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:35:51,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:35:51,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:35:51,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 14:35:54,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:35:54,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:35:57,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:35:59,441 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. 
Duration: 0.79 2023-09-30 14:35:59,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:36:05,085 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=742746.6666666666, ans=0.125 2023-09-30 14:36:06,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:36:06,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 14:36:11,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:36:18,545 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.37 vs. limit=15.0 2023-09-30 14:36:20,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:20,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:24,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:24,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:26,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. 
Duration: 0.53 2023-09-30 14:36:32,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:36:33,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:36:33,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:36:35,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:37,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:39,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 14:36:45,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:46,338 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:36:47,715 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:47,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:36:49,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:36:49,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 14:36:49,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:36:50,682 INFO [train.py:1039] (0/4) Epoch 21, batch 5200, loss[loss=0.1695, simple_loss=0.2582, pruned_loss=0.04038, over 24631.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2525, pruned_loss=0.05061, over 4706471.33 frames. ], batch size: 68, lr: 4.84e-03, grad_scale: 32.0 2023-09-30 14:36:50,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:36:54,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:36:56,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:36:59,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:02,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 14:37:04,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:37:05,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:07,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:08,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 14:37:08,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:10,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 14:37:10,613 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=743013.3333333334, ans=0.125 2023-09-30 14:37:15,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:37:15,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:18,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 14:37:21,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:37:22,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:37:23,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 14:37:23,741 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 14:37:25,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 14:37:27,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 14:37:27,500 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 14:37:27,521 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:29,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:29,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:37:30,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 14:37:31,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:37:32,283 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.872e+02 2.039e+02 2.379e+02 3.474e+02, threshold=4.079e+02, percent-clipped=0.0 2023-09-30 14:37:33,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:35,657 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 14:37:35,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 14:37:37,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 14:37:43,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. 
Duration: 0.37 2023-09-30 14:37:43,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:37:49,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:37:49,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:37:51,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 14:37:51,379 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:52,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 14:37:52,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:52,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:37:56,395 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:37:56,769 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=743213.3333333334, ans=0.125 2023-09-30 14:37:57,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 14:38:03,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:38:04,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:04,773 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:05,563 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=15.0 2023-09-30 14:38:09,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:09,638 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 14:38:11,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:38:11,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:38:12,505 INFO [train.py:1039] (0/4) Epoch 21, batch 5250, loss[loss=0.1595, simple_loss=0.2392, pruned_loss=0.03988, over 21198.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2513, pruned_loss=0.04994, over 4711205.04 frames. ], batch size: 46, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:38:12,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:14,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. 
Duration: 0.636375 2023-09-30 14:38:14,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:38:17,204 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:38:19,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=743280.0, ans=0.0 2023-09-30 14:38:20,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:20,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:38:22,197 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:38:26,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:28,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:38:30,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:38:33,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:38:35,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 14:38:35,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:38:37,240 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:44,561 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=743413.3333333334, ans=0.125 2023-09-30 14:38:53,121 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=743413.3333333334, ans=0.125 2023-09-30 14:39:09,940 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.48 vs. limit=15.0 2023-09-30 14:39:13,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=743546.6666666666, ans=0.0 2023-09-30 14:39:25,624 INFO [train.py:1039] (0/4) Epoch 21, batch 5300, loss[loss=0.1631, simple_loss=0.209, pruned_loss=0.05865, over 19005.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2496, pruned_loss=0.0494, over 4706030.15 frames. ], batch size: 388, lr: 4.84e-03, grad_scale: 8.0 2023-09-30 14:39:33,958 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.44 vs. limit=15.0 2023-09-30 14:39:38,811 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.63 vs. limit=15.0 2023-09-30 14:39:40,856 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/epoch-21.pt 2023-09-30 14:39:46,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:39:46,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. 
Duration: 0.9 2023-09-30 14:39:46,343 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 14:39:46,366 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:46,756 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:46,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:46,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:46,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:47,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:39:47,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:47,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:39:47,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:39:47,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 14:39:47,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. 
Duration: 0.74 2023-09-30 14:39:47,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 14:39:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:39:48,156 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 14:39:48,279 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 14:39:48,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:49,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:49,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:49,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:49,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:39:50,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:50,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:50,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 14:39:50,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:50,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:50,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:39:50,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:50,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:39:51,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 14:39:51,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:51,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:51,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 14:39:51,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 14:39:52,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:39:52,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:39:52,658 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 14:39:52,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 14:39:53,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:53,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:39:53,857 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:54,008 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 14:39:54,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 14:39:54,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:39:54,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:54,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 14:39:54,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 14:39:54,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. 
Duration: 0.96 2023-09-30 14:39:54,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:58,324 INFO [train.py:1039] (0/4) Epoch 22, batch 0, loss[loss=0.17, simple_loss=0.2434, pruned_loss=0.04825, over 23751.00 frames. ], tot_loss[loss=0.17, simple_loss=0.2434, pruned_loss=0.04825, over 23751.00 frames. ], batch size: 212, lr: 4.73e-03, grad_scale: 16.0 2023-09-30 14:39:58,325 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 14:40:11,496 INFO [train.py:1071] (0/4) Epoch 22, validation: loss=0.3042, simple_loss=0.2741, pruned_loss=0.1671, over 1125622.00 frames. 2023-09-30 14:40:11,497 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20954MB 2023-09-30 14:40:13,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 14:40:15,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:40:16,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:40:21,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:21,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:40:22,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:23,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. 
Duration: 0.68 2023-09-30 14:40:25,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 14:40:28,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:28,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:32,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:40:32,924 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:35,703 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.911e+02 2.208e+02 2.678e+02 6.793e+02, threshold=4.416e+02, percent-clipped=10.0 2023-09-30 14:40:35,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 14:40:37,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:46,361 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:40:46,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:40:48,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 14:40:54,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:40:54,681 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:40:57,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:01,581 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:41:06,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:06,248 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=743893.3333333334, ans=0.125 2023-09-30 14:41:07,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=743893.3333333334, ans=0.125 2023-09-30 14:41:10,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 14:41:13,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 14:41:15,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:41:15,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:15,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:41:17,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:41:19,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 14:41:21,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:24,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:29,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:41:31,604 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 14:41:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:41:35,339 INFO [train.py:1039] (0/4) Epoch 22, batch 50, loss[loss=0.1898, simple_loss=0.2754, pruned_loss=0.05211, over 24642.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2518, pruned_loss=0.04938, over 1057677.60 frames. ], batch size: 73, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:41:36,154 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.11 vs. 
limit=15.0 2023-09-30 14:41:38,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:40,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:40,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 14:41:40,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:41:40,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:41:43,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:43,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:45,680 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.18 vs. limit=15.0 2023-09-30 14:41:46,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:50,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 14:41:51,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 14:41:51,333 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=744093.3333333334, ans=0.2 2023-09-30 14:41:59,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:42:00,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 14:42:02,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 14:42:03,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:42:06,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:06,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:06,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:08,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:42:09,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:42:09,887 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:16,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:42:19,736 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:19,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:42:19,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 14:42:22,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:42:24,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:42:24,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 14:42:24,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:25,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 14:42:33,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=744226.6666666666, ans=0.0 2023-09-30 14:42:35,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:42:35,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:37,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:42:39,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:39,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:41,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 14:42:41,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 14:42:43,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:43,264 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:44,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:46,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:47,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 14:42:47,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 14:42:50,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:42:52,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:42:52,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:42:52,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 14:42:53,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 14:42:54,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:54,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:56,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:42:57,530 INFO [train.py:1039] (0/4) Epoch 22, batch 100, loss[loss=0.1843, simple_loss=0.2674, pruned_loss=0.0506, over 24120.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2544, pruned_loss=0.05107, over 1858627.87 frames. ], batch size: 80, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:42:57,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:43:00,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:43:00,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=744360.0, ans=0.125 2023-09-30 14:43:03,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:43:07,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:07,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 14:43:07,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:43:12,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:43:12,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:12,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:43:12,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:43:12,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:14,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 14:43:16,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:43:16,366 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=744426.6666666666, ans=0.5 2023-09-30 14:43:17,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:43:17,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:17,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:17,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=744426.6666666666, ans=0.09899494936611666 2023-09-30 14:43:19,439 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=744426.6666666666, ans=0.04949747468305833 2023-09-30 14:43:22,473 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.832e+02 2.020e+02 2.279e+02 4.259e+02, threshold=4.040e+02, percent-clipped=0.0 2023-09-30 14:43:22,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 14:43:22,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:23,010 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=744426.6666666666, ans=0.125 2023-09-30 14:43:24,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:25,670 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:43:27,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:43:31,637 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0120-509520 from training. 
Duration: 0.768 2023-09-30 14:43:31,673 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 14:43:32,057 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=744493.3333333334, ans=0.125 2023-09-30 14:43:33,286 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:43:33,287 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:43:36,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:43:38,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:38,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:41,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=744493.3333333334, ans=0.1 2023-09-30 14:43:45,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:47,262 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 14:43:48,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. 
Duration: 0.4 2023-09-30 14:43:50,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=744560.0, ans=0.0 2023-09-30 14:43:53,311 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:43:53,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:43:56,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:00,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:03,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:04,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:44:07,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:07,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:10,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:10,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 14:44:10,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:12,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 14:44:12,193 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 14:44:12,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:14,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:44:14,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:14,424 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:14,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:44:14,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:44:15,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. 
Duration: 0.67275 2023-09-30 14:44:15,407 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=744626.6666666666, ans=0.125 2023-09-30 14:44:15,494 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=744626.6666666666, ans=0.0 2023-09-30 14:44:16,562 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:18,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:18,543 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:19,976 INFO [train.py:1039] (0/4) Epoch 22, batch 150, loss[loss=0.1505, simple_loss=0.233, pruned_loss=0.034, over 24486.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2539, pruned_loss=0.05048, over 2502595.33 frames. ], batch size: 63, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:44:20,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:44:21,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:44:24,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:27,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:27,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:44:27,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:29,313 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=744693.3333333334, ans=0.125 2023-09-30 14:44:30,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:30,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:34,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:44:36,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:40,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 14:44:40,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 14:44:40,587 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 14:44:43,646 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:44:43,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:44:45,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 14:44:47,272 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:47,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:47,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:47,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:48,925 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 14:44:51,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:57,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:00,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:45:01,745 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 14:45:05,011 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=744826.6666666666, ans=0.125 2023-09-30 14:45:06,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:45:06,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:45:08,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:09,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:45:11,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:45:12,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:45:12,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:12,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 14:45:17,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:19,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:19,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:45:19,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:45:22,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:24,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. 
Duration: 0.3545625 2023-09-30 14:45:26,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:45:28,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:45:29,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:31,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:45:32,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 14:45:32,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:32,890 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 14:45:37,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:40,980 INFO [train.py:1039] (0/4) Epoch 22, batch 200, loss[loss=0.1906, simple_loss=0.2741, pruned_loss=0.05352, over 23716.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.254, pruned_loss=0.05046, over 2994016.99 frames. ], batch size: 85, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:45:41,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:45:41,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 14:45:45,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 14:45:45,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:45,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=745026.6666666666, ans=0.0 2023-09-30 14:45:47,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:48,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 14:45:50,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:45:51,873 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:53,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:58,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:45:58,314 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:58,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 14:46:05,543 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.853e+02 2.014e+02 2.371e+02 3.492e+02, threshold=4.028e+02, percent-clipped=0.0 2023-09-30 14:46:16,654 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=745160.0, ans=0.125 2023-09-30 14:46:18,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:46:19,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:46:20,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:46:20,380 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:46:21,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:46:23,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 14:46:23,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:46:23,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:24,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 14:46:25,124 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=745160.0, ans=0.125 2023-09-30 14:46:26,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:26,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:46:28,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 14:46:28,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:46:28,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:30,396 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=745226.6666666666, ans=0.0 2023-09-30 14:46:35,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:46:41,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:47,850 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:47,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 14:46:55,689 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.32 vs. limit=22.5 2023-09-30 14:46:57,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:58,285 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=745293.3333333334, ans=0.0 2023-09-30 14:47:00,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 14:47:00,981 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:00,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:47:02,299 INFO [train.py:1039] (0/4) Epoch 22, batch 250, loss[loss=0.1782, simple_loss=0.2618, pruned_loss=0.04735, over 24353.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2539, pruned_loss=0.05037, over 3373683.99 frames. ], batch size: 74, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:47:02,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:02,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:47:04,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 14:47:04,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:47:04,164 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 14:47:04,506 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=745360.0, ans=0.0 2023-09-30 14:47:06,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:09,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:47:09,558 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:12,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:13,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:47:15,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:16,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:47:19,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:47:24,325 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=745426.6666666666, ans=0.0 2023-09-30 14:47:32,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 14:47:35,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:35,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:47:43,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:47:43,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:47:45,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:47:45,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:47,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:47:47,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:47:47,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:49,517 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.19 vs. limit=6.0 2023-09-30 14:47:52,554 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 14:47:55,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 14:47:55,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:56,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:47:56,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:47:56,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:47:57,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:47:57,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:47:58,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:48:00,121 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:01,712 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:48:03,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:05,310 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 14:48:09,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:12,387 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.88 vs. limit=5.0 2023-09-30 14:48:12,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:48:18,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:19,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:48:25,014 INFO [train.py:1039] (0/4) Epoch 22, batch 300, loss[loss=0.1701, simple_loss=0.2368, pruned_loss=0.05172, over 23766.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2513, pruned_loss=0.04935, over 3668104.38 frames. ], batch size: 164, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:48:25,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 14:48:25,218 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:48:25,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:48:28,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 14:48:28,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. 
Duration: 0.62225 2023-09-30 14:48:29,780 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:48:29,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 14:48:30,407 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.99 vs. limit=15.0 2023-09-30 14:48:31,988 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.16 vs. limit=15.0 2023-09-30 14:48:34,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:36,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:48:39,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:48:41,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 14:48:42,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:44,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:48:44,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. 
Duration: 0.85 2023-09-30 14:48:44,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:48:49,077 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.843e+02 2.061e+02 2.404e+02 3.309e+02, threshold=4.123e+02, percent-clipped=0.0 2023-09-30 14:48:49,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=745760.0, ans=0.125 2023-09-30 14:48:50,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:48:53,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:48:53,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 14:48:56,925 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 14:48:57,008 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:00,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:02,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:02,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 14:49:02,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 14:49:05,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:49:06,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:49:06,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:10,637 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:49:10,644 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 14:49:12,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:49:15,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:16,604 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 14:49:16,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:22,230 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:49:25,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:49:25,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. 
Duration: 0.57 2023-09-30 14:49:29,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:29,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:49:32,549 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:34,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:49:34,268 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=745960.0, ans=0.0 2023-09-30 14:49:35,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 14:49:35,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:49:35,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:37,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 14:49:38,799 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:38,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:41,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 14:49:41,262 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=745960.0, ans=0.0 2023-09-30 14:49:41,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=745960.0, ans=0.0 2023-09-30 14:49:42,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:42,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:44,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=745960.0, ans=0.1 2023-09-30 14:49:47,002 INFO [train.py:1039] (0/4) Epoch 22, batch 350, loss[loss=0.154, simple_loss=0.2321, pruned_loss=0.03794, over 24273.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2491, pruned_loss=0.04846, over 3871515.89 frames. ], batch size: 56, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:49:48,638 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:49:48,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:49:53,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:57,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 14:49:58,956 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=746026.6666666666, ans=0.1 2023-09-30 14:50:01,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:01,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:05,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 14:50:07,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:07,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 14:50:10,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:10,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 14:50:12,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:15,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 14:50:18,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:50:20,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:50:20,973 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=746160.0, ans=0.125 2023-09-30 14:50:22,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:50:23,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:23,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:23,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:50:27,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:50:27,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:30,770 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=746160.0, ans=0.0 2023-09-30 14:50:35,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:50:35,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 14:50:35,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:50:35,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:37,111 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=746226.6666666666, ans=0.2 2023-09-30 14:50:40,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 14:50:40,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:43,226 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=746226.6666666666, ans=0.125 2023-09-30 14:50:45,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:45,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:50:45,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:47,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 14:50:49,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:50:49,647 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0515W0043-254911 from training. 
Duration: 0.976 2023-09-30 14:50:51,320 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 14:50:52,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:55,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:55,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 14:50:56,015 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=746293.3333333334, ans=0.05 2023-09-30 14:50:58,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:00,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:51:04,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:05,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:05,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:07,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 14:51:10,210 INFO [train.py:1039] (0/4) Epoch 22, batch 400, loss[loss=0.1747, simple_loss=0.2417, pruned_loss=0.05383, over 23788.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2488, pruned_loss=0.04851, over 4060059.53 frames. ], batch size: 212, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:51:10,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:51:11,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:51:13,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 14:51:13,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:15,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:16,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:51:17,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:20,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:22,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:24,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. 
Duration: 0.69 2023-09-30 14:51:25,988 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=746426.6666666666, ans=0.125 2023-09-30 14:51:27,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 14:51:27,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:28,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 14:51:30,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:33,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:51:33,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:33,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 14:51:34,930 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.861e+02 2.105e+02 2.651e+02 3.953e+02, threshold=4.209e+02, percent-clipped=0.0 2023-09-30 14:51:35,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:51:35,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 14:51:35,371 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=746426.6666666666, ans=0.125 2023-09-30 14:51:36,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:36,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:38,926 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 14:51:40,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 14:51:45,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:46,629 WARNING [train.py:1197] (0/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:46,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 14:51:47,104 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=746493.3333333334, ans=0.0 2023-09-30 14:51:50,068 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 14:51:53,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:51:55,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 14:52:02,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 14:52:05,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:52:05,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=746560.0, ans=0.125 2023-09-30 14:52:07,363 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.47 vs. limit=15.0 2023-09-30 14:52:08,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 14:52:09,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:52:12,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:52:12,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 14:52:15,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:52:18,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:52:19,165 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.64 vs. 
limit=10.0 2023-09-30 14:52:19,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:52:23,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:23,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 14:52:24,059 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=746626.6666666666, ans=0.125 2023-09-30 14:52:25,515 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-112000.pt 2023-09-30 14:52:30,007 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:52:30,236 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=746626.6666666666, ans=0.2 2023-09-30 14:52:31,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 14:52:33,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:52:33,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:52:35,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 14:52:35,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 14:52:35,723 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=746626.6666666666, ans=0.125 2023-09-30 14:52:36,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:52:37,762 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.86 vs. limit=22.5 2023-09-30 14:52:39,084 INFO [train.py:1039] (0/4) Epoch 22, batch 450, loss[loss=0.1983, simple_loss=0.2672, pruned_loss=0.06467, over 22745.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2506, pruned_loss=0.04885, over 4208328.66 frames. ], batch size: 322, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:52:39,145 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:52:40,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 14:52:41,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:52:42,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:52:43,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:52:43,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 14:52:43,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 14:52:45,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:52:48,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:52:48,571 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:52:56,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:58,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:52:59,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 14:52:59,829 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 14:53:03,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:53:03,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=746760.0, ans=0.0 2023-09-30 14:53:06,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:08,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:14,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:53:16,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:18,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 14:53:19,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 14:53:21,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 14:53:21,443 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:53:23,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:24,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:53:25,131 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 14:53:25,146 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 14:53:26,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:28,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:53:28,377 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. 
Duration: 0.463625 2023-09-30 14:53:32,815 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:53:32,870 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:53:34,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 14:53:34,427 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 14:53:36,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:38,382 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:53:38,440 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:53:39,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 14:53:45,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:53:46,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 14:53:46,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 14:53:48,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 14:53:53,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:53:53,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=746960.0, ans=0.125 2023-09-30 14:53:56,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:53:58,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:53:58,622 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 14:54:01,598 INFO [train.py:1039] (0/4) Epoch 22, batch 500, loss[loss=0.1735, simple_loss=0.2519, pruned_loss=0.04759, over 23299.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2516, pruned_loss=0.04959, over 4319980.98 frames. ], batch size: 93, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:54:01,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:03,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:54:03,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:03,314 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 14:54:04,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. 
Duration: 0.73 2023-09-30 14:54:04,976 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:08,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=747026.6666666666, ans=0.1 2023-09-30 14:54:09,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:54:13,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:54:16,107 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:54:17,879 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:54:17,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:19,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:26,501 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.831e+02 2.107e+02 2.492e+02 3.806e+02, threshold=4.214e+02, percent-clipped=0.0 2023-09-30 14:54:30,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:31,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. 
Duration: 0.536375 2023-09-30 14:54:31,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:54:31,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:32,483 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.73 vs. limit=6.0 2023-09-30 14:54:33,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 14:54:33,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:54:38,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:54:39,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:54:39,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:54:39,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:41,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. 
Duration: 0.82 2023-09-30 14:54:41,260 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=747160.0, ans=0.1 2023-09-30 14:54:44,438 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=747160.0, ans=0.125 2023-09-30 14:54:45,666 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 14:54:47,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:54:49,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:49,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=747226.6666666666, ans=0.0 2023-09-30 14:54:51,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,442 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:54:52,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:54:55,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 14:54:59,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 14:55:01,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:04,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:09,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:55:15,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:17,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 14:55:17,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:17,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:20,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 14:55:22,224 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:55:24,281 INFO [train.py:1039] (0/4) Epoch 22, batch 550, loss[loss=0.1789, simple_loss=0.2504, pruned_loss=0.05369, over 23658.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.252, pruned_loss=0.04962, over 4406008.68 frames. ], batch size: 149, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 14:55:24,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:55:27,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 14:55:30,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 14:55:30,701 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:30,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 14:55:30,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:55:32,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:32,829 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:33,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:33,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:55:33,975 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=747360.0, ans=0.0 2023-09-30 14:55:35,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:55:38,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:55:39,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 14:55:39,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:55:41,910 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.96 vs. limit=15.0 2023-09-30 14:55:46,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:55:46,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:47,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:55:47,913 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=747426.6666666666, ans=0.2 2023-09-30 14:55:49,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:51,014 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=747426.6666666666, ans=0.125 2023-09-30 14:55:53,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 14:55:54,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. 
Duration: 0.97 2023-09-30 14:55:54,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=747426.6666666666, ans=0.125 2023-09-30 14:55:55,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:56:02,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:56:02,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:03,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:56:09,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:09,083 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 14:56:10,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:56:12,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 14:56:14,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:15,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:56:15,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 14:56:17,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:17,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 14:56:19,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 14:56:21,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:21,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:56:21,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:56:21,315 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:56:24,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:56:25,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:56:28,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:56:29,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:30,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-30 14:56:32,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:56:34,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:35,684 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:56:35,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:38,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:56:38,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:56:39,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=747626.6666666666, ans=0.1 2023-09-30 14:56:44,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 14:56:47,700 INFO [train.py:1039] (0/4) Epoch 22, batch 600, loss[loss=0.1755, simple_loss=0.2572, pruned_loss=0.04692, over 24648.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2523, pruned_loss=0.04942, over 4479750.35 frames. ], batch size: 65, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:56:49,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 14:56:50,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 14:56:50,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:56:50,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:57,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:00,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:57:01,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 14:57:03,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:57:07,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:08,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:11,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 14:57:11,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:57:13,345 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.858e+02 2.031e+02 2.339e+02 3.248e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 14:57:18,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. 
Duration: 0.83 2023-09-30 14:57:22,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:57:22,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:23,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:57:28,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:57:28,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:57:30,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:37,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:57:40,959 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:40,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:40,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:49,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 14:57:54,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. 
Duration: 0.57275 2023-09-30 14:57:56,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:59,364 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 14:58:01,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:58:04,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 14:58:04,768 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:58:04,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:58:07,226 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=3.96 vs. limit=15.0 2023-09-30 14:58:09,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=748026.6666666666, ans=0.125 2023-09-30 14:58:10,797 INFO [train.py:1039] (0/4) Epoch 22, batch 650, loss[loss=0.1563, simple_loss=0.2178, pruned_loss=0.04737, over 22744.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2517, pruned_loss=0.04967, over 4525523.69 frames. ], batch size: 322, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:58:10,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:58:12,565 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 14:58:16,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:58:17,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:58:19,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:21,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 14:58:23,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:58:29,866 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:58:29,867 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:34,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:38,592 WARNING [train.py:1197] (0/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 14:58:40,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:58:41,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:44,822 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 14:58:44,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 14:58:47,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:47,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:48,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:58:51,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:51,647 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:58:51,947 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=748160.0, ans=0.0 2023-09-30 14:58:54,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:58:54,750 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 14:58:54,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:54,799 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:58:57,653 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.97 vs. 
limit=15.0 2023-09-30 14:58:58,662 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=748160.0, ans=0.125 2023-09-30 14:58:59,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:59,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:01,283 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:01,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:59:02,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 14:59:04,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:59:04,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:59:06,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:59:06,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:08,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:59:09,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. 
Duration: 0.83 2023-09-30 14:59:11,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 14:59:11,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:11,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:59:12,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:59:12,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:59:14,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:59:20,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:20,710 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:59:22,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:59:23,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:23,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 14:59:25,746 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 14:59:32,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:59:32,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:32,515 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:32,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:33,881 INFO [train.py:1039] (0/4) Epoch 22, batch 700, loss[loss=0.1889, simple_loss=0.2557, pruned_loss=0.06107, over 23842.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2495, pruned_loss=0.04941, over 4552429.92 frames. ], batch size: 164, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:59:37,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 14:59:37,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 14:59:41,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 14:59:41,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:43,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:59:45,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. 
Duration: 0.99 2023-09-30 14:59:50,352 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:50,498 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=748426.6666666666, ans=0.0 2023-09-30 14:59:53,563 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:59:54,071 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.00 vs. limit=15.0 2023-09-30 14:59:55,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:58,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:59:58,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:00:00,128 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.824e+02 1.972e+02 2.211e+02 2.960e+02, threshold=3.944e+02, percent-clipped=0.0 2023-09-30 15:00:00,593 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=748426.6666666666, ans=0.5 2023-09-30 15:00:01,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:00:04,585 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.78 vs. 
limit=22.5 2023-09-30 15:00:07,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:00:07,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:00:07,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 15:00:07,882 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.86 vs. limit=22.5 2023-09-30 15:00:11,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 15:00:15,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:00:16,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:00:18,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:00:22,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:00:23,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 15:00:25,352 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=748560.0, ans=0.125 2023-09-30 15:00:26,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:00:26,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:00:28,246 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 15:00:30,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:00:32,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:35,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:00:40,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:00:41,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 15:00:45,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 15:00:45,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 15:00:47,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=748626.6666666666, ans=0.125 2023-09-30 15:00:49,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:52,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:00:53,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:00:53,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:54,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 15:00:56,881 INFO [train.py:1039] (0/4) Epoch 22, batch 750, loss[loss=0.1899, simple_loss=0.2621, pruned_loss=0.05883, over 23860.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2498, pruned_loss=0.0493, over 4590003.64 frames. ], batch size: 179, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 15:00:58,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 15:00:58,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 15:00:58,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 15:01:00,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 15:01:00,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 15:01:00,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:01:01,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. 
Duration: 0.89 2023-09-30 15:01:03,814 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:01:03,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:05,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:06,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:08,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:01:08,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:13,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:01:14,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:01:16,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:01:20,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:20,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:20,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. 
Duration: 0.93 2023-09-30 15:01:24,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:01:24,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:25,780 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:27,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:01:28,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 15:01:28,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:01:30,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 15:01:30,451 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 15:01:31,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 15:01:31,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:01:32,452 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.58 vs. limit=22.5 2023-09-30 15:01:33,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 15:01:34,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:01:38,759 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=748826.6666666666, ans=0.0 2023-09-30 15:01:41,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:41,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:01:41,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:01:41,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:45,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:45,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 15:01:46,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:01:46,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:01:48,308 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:01:52,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 15:01:54,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 15:01:54,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:00,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:01,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:02:03,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:06,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:02:06,423 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:02:10,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 15:02:10,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:10,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:13,093 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 15:02:13,222 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=748960.0, ans=0.125 2023-09-30 15:02:14,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:16,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:16,698 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=748960.0, ans=0.1 2023-09-30 15:02:18,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:02:19,358 INFO [train.py:1039] (0/4) Epoch 22, batch 800, loss[loss=0.1885, simple_loss=0.2533, pruned_loss=0.06184, over 23553.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2504, pruned_loss=0.04854, over 4633630.65 frames. ], batch size: 256, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:02:26,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:26,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:28,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:28,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:30,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:02:30,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:32,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=749026.6666666666, ans=0.05 2023-09-30 15:02:34,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:37,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:38,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:02:40,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 15:02:41,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:42,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:42,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:02:43,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:43,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 15:02:45,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:02:45,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 15:02:47,015 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 2.019e+02 2.284e+02 4.447e+02, threshold=4.039e+02, percent-clipped=1.0 2023-09-30 15:02:48,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:51,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:54,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:55,022 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=749160.0, ans=0.0 2023-09-30 15:02:56,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:59,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:59,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:01,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:03,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. 
Duration: 0.8090625 2023-09-30 15:03:03,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 15:03:06,948 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 15:03:06,993 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 15:03:07,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:03:08,354 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:09,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:10,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:15,299 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 15:03:15,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 15:03:16,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:03:18,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:03:22,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 15:03:25,416 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:25,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=749293.3333333334, ans=0.2 2023-09-30 15:03:27,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 15:03:28,234 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.51 vs. limit=22.5 2023-09-30 15:03:28,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:03:30,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 15:03:38,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:42,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:03:42,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 15:03:42,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:03:44,064 INFO [train.py:1039] (0/4) Epoch 22, batch 850, loss[loss=0.1622, simple_loss=0.2375, pruned_loss=0.04342, over 22004.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2504, pruned_loss=0.04876, over 4655874.82 frames. 
], batch size: 48, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:03:44,229 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:45,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 15:03:45,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:47,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:03:49,210 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:50,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:03:52,255 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:52,436 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 15:03:53,955 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 15:03:53,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 15:03:54,585 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.51 vs. 
limit=15.0 2023-09-30 15:03:55,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:55,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:59,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:59,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:59,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:04:04,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:04,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:04,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 15:04:07,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 15:04:12,893 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:13,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 15:04:14,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. 
Duration: 0.62 2023-09-30 15:04:16,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 15:04:16,666 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=749493.3333333334, ans=0.1 2023-09-30 15:04:19,520 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 15:04:19,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:19,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:04:20,186 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:04:24,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:24,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:25,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 15:04:26,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:28,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:28,271 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 15:04:29,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:04:31,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:04:32,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:04:32,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 15:04:36,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:04:36,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:04:38,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:04:38,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:04:41,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:44,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:46,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:04:49,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 15:04:49,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:04:50,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:05:00,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:05:02,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:05:02,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 15:05:02,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:02,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:05:02,532 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:05:05,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 15:05:06,775 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.93 vs. limit=15.0 2023-09-30 15:05:07,236 INFO [train.py:1039] (0/4) Epoch 22, batch 900, loss[loss=0.1773, simple_loss=0.2438, pruned_loss=0.05542, over 23846.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2512, pruned_loss=0.04904, over 4679077.00 frames. 
], batch size: 164, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:05:13,360 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:05:13,692 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=749693.3333333334, ans=0.025 2023-09-30 15:05:18,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:18,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 15:05:21,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:05:22,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 15:05:22,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=749760.0, ans=0.125 2023-09-30 15:05:22,339 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=749760.0, ans=0.0 2023-09-30 15:05:23,556 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:05:25,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:25,127 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:05:25,204 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:05:25,960 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=12.0 2023-09-30 15:05:26,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:05:30,182 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=749760.0, ans=0.1 2023-09-30 15:05:32,804 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.809e+02 2.095e+02 2.574e+02 4.591e+02, threshold=4.190e+02, percent-clipped=1.0 2023-09-30 15:05:36,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:05:36,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:36,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:05:38,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:43,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 15:05:44,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 15:05:48,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:05:49,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:05:50,043 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 15:05:51,532 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 15:05:53,344 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=749826.6666666666, ans=0.05 2023-09-30 15:05:59,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:05:59,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:05:59,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:06:07,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:07,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:06:09,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. 
Duration: 0.99 2023-09-30 15:06:09,158 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:06:10,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:06:12,650 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 15:06:15,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:06:15,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:17,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:06:17,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:24,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 15:06:25,476 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 15:06:25,694 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:06:25,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 15:06:28,570 INFO [train.py:1039] (0/4) Epoch 22, batch 950, loss[loss=0.1775, simple_loss=0.2544, pruned_loss=0.05031, over 24479.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2511, pruned_loss=0.04915, over 4695840.35 frames. 
], batch size: 66, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:06:28,794 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:33,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 15:06:34,953 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=750026.6666666666, ans=15.0 2023-09-30 15:06:38,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:41,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:41,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:43,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:06:46,322 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 15:06:49,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:50,627 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:06:52,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:52,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 15:06:52,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 15:06:52,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:06:55,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:56,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 15:06:56,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:07:00,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:00,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:07:00,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:07:00,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 15:07:01,227 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=750160.0, ans=0.025 2023-09-30 15:07:04,566 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:07:06,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 15:07:07,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:07:12,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:07:12,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:07:16,640 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 15:07:20,126 WARNING [train.py:1197] (0/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:07:20,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:07:20,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:22,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:22,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:07:24,267 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=750226.6666666666, ans=0.09899494936611666 2023-09-30 15:07:27,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 15:07:27,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 15:07:27,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=750226.6666666666, ans=0.1 2023-09-30 15:07:30,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:32,376 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:32,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 15:07:32,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:32,432 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:07:32,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 15:07:37,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:07:40,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:45,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:07:47,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 15:07:47,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. 
Duration: 0.83 2023-09-30 15:07:51,490 INFO [train.py:1039] (0/4) Epoch 22, batch 1000, loss[loss=0.1491, simple_loss=0.2246, pruned_loss=0.03675, over 24266.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2506, pruned_loss=0.04899, over 4701606.49 frames. ], batch size: 56, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:07:54,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:58,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 15:07:58,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:03,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:08:05,297 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 15:08:05,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 15:08:10,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:10,595 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:08:14,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:08:14,573 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=750426.6666666666, ans=0.0 2023-09-30 15:08:15,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 15:08:18,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 15:08:20,249 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.832e+02 2.013e+02 2.220e+02 3.757e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 15:08:21,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 15:08:21,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:24,820 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 15:08:25,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 15:08:25,959 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.43 vs. limit=15.0 2023-09-30 15:08:26,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 15:08:27,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:28,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:08:37,307 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=750493.3333333334, ans=0.125 2023-09-30 15:08:38,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:39,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:08:40,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:40,110 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:40,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 15:08:41,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:43,807 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:08:43,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:43,971 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 15:08:49,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 15:08:50,677 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. 
Duration: 0.86 2023-09-30 15:08:52,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 15:08:52,718 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=750560.0, ans=0.1 2023-09-30 15:08:53,958 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:08:58,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:58,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:08:59,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:00,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:09:00,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=750626.6666666666, ans=0.0 2023-09-30 15:09:02,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 15:09:02,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:09:02,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 15:09:04,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. 
Duration: 0.66 2023-09-30 15:09:04,986 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:04,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:09:06,529 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=750626.6666666666, ans=0.125 2023-09-30 15:09:09,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:09:11,067 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:09:14,054 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:09:15,503 INFO [train.py:1039] (0/4) Epoch 22, batch 1050, loss[loss=0.1641, simple_loss=0.2404, pruned_loss=0.04388, over 20839.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2487, pruned_loss=0.04865, over 4687995.38 frames. ], batch size: 45, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:09:17,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:09:21,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:09:22,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:09:24,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:09:24,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:29,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:09:30,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:09:32,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:09:33,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:09:33,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:09:34,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:09:35,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 15:09:35,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:37,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 15:09:41,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:41,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. 
Duration: 0.82 2023-09-30 15:09:41,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:09:44,628 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=750760.0, ans=0.125 2023-09-30 15:09:49,044 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:49,645 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.77 vs. limit=15.0 2023-09-30 15:09:49,676 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=22.5 2023-09-30 15:09:50,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:09:50,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:52,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 15:09:54,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 15:09:54,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 15:09:54,441 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=750826.6666666666, ans=0.0 2023-09-30 15:09:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 15:09:58,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 15:09:58,300 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=750826.6666666666, ans=0.125 2023-09-30 15:09:59,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:02,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:10:04,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:10:05,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:06,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:10:10,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:10:14,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 15:10:16,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. 
Duration: 0.96 2023-09-30 15:10:16,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 15:10:16,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:17,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:10:19,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 15:10:23,236 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:10:24,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:24,729 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:10:24,831 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:24,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 15:10:32,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 15:10:32,808 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 15:10:32,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 15:10:34,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:10:36,132 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=750960.0, ans=0.125 2023-09-30 15:10:36,249 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=750960.0, ans=0.0 2023-09-30 15:10:37,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:10:37,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=751026.6666666666, ans=0.0 2023-09-30 15:10:38,852 INFO [train.py:1039] (0/4) Epoch 22, batch 1100, loss[loss=0.164, simple_loss=0.242, pruned_loss=0.04303, over 24404.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.248, pruned_loss=0.04827, over 4699601.72 frames. ], batch size: 58, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:10:42,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=751026.6666666666, ans=0.0 2023-09-30 15:10:45,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:10:50,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 15:10:53,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:10:53,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:10:53,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 15:10:55,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:56,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:10:59,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:11:02,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:11:02,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 15:11:05,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:11:06,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:06,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 15:11:08,138 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.829e+02 2.096e+02 2.478e+02 4.106e+02, threshold=4.191e+02, percent-clipped=1.0 2023-09-30 15:11:09,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:11:09,922 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:11:10,153 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=751160.0, ans=0.0 2023-09-30 15:11:16,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:11:16,545 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=751160.0, ans=0.1 2023-09-30 15:11:19,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 15:11:19,940 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 15:11:21,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:22,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:24,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:11:24,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:11:26,533 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 15:11:27,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:11:27,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:11:28,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:11:29,446 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:29,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 15:11:31,836 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.72 vs. limit=15.0 2023-09-30 15:11:34,913 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:11:34,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 15:11:37,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:11:41,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:11:44,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. 
Duration: 0.99 2023-09-30 15:11:44,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:11:45,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:47,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:47,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=751293.3333333334, ans=0.125 2023-09-30 15:11:48,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:50,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 15:11:52,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:11:52,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:52,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 15:11:54,052 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:11:54,132 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 15:11:56,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:11:56,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:11:57,691 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:12:00,852 INFO [train.py:1039] (0/4) Epoch 22, batch 1150, loss[loss=0.1568, simple_loss=0.244, pruned_loss=0.03479, over 24445.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.249, pruned_loss=0.04885, over 4690716.77 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:12:01,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:04,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:12:07,995 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:08,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:12:08,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 15:12:09,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:09,820 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=751360.0, ans=0.125 2023-09-30 15:12:12,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. 
Duration: 0.69 2023-09-30 15:12:15,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:15,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:12:19,377 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.35 vs. limit=22.5 2023-09-30 15:12:20,763 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-30 15:12:21,581 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 15:12:23,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:28,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:29,674 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:29,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 15:12:29,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:12:29,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 15:12:35,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 15:12:36,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:38,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:40,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=751493.3333333334, ans=0.125 2023-09-30 15:12:48,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:55,265 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:56,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 15:12:56,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:12:56,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:01,974 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 15:13:03,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:13:05,368 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751626.6666666666, ans=0.1 2023-09-30 15:13:06,904 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=751626.6666666666, ans=0.125 2023-09-30 15:13:10,498 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 15:13:17,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:18,651 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:13:18,699 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:13:18,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:13:22,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:23,929 INFO [train.py:1039] (0/4) Epoch 22, batch 1200, loss[loss=0.1874, simple_loss=0.2694, pruned_loss=0.05272, over 24041.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2508, pruned_loss=0.04917, over 4701408.57 frames. ], batch size: 80, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:13:27,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:13:27,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 15:13:28,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:28,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:30,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:13:33,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:13:34,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751693.3333333334, ans=0.1 2023-09-30 15:13:35,370 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:13:35,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:35,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:38,645 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 15:13:38,935 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=751760.0, ans=0.125 2023-09-30 15:13:40,501 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=751760.0, ans=0.0 2023-09-30 15:13:41,735 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. 
Duration: 0.51 2023-09-30 15:13:44,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:13:45,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:13:47,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:51,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:13:51,376 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 15:13:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:55,738 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.831e+02 1.967e+02 2.216e+02 3.733e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 15:13:58,389 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.65 vs. limit=15.0 2023-09-30 15:14:00,699 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:14:00,719 WARNING [train.py:1197] (0/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:14:00,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 15:14:00,874 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 15:14:05,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 15:14:10,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 15:14:10,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:14:11,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:14:13,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:13,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:14:15,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:14:15,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:14:15,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:14:16,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 15:14:18,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:14:19,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 15:14:19,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:14:22,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:22,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:24,606 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751893.3333333334, ans=0.1 2023-09-30 15:14:27,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:14:27,683 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=751893.3333333334, ans=0.0 2023-09-30 15:14:29,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:14:32,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 15:14:36,892 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 15:14:39,038 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=751960.0, ans=0.0 2023-09-30 15:14:40,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:14:42,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 15:14:42,615 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=751960.0, ans=0.125 2023-09-30 15:14:43,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:14:45,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:46,633 INFO [train.py:1039] (0/4) Epoch 22, batch 1250, loss[loss=0.1767, simple_loss=0.2623, pruned_loss=0.0456, over 24003.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2513, pruned_loss=0.04875, over 4721567.99 frames. ], batch size: 80, lr: 4.70e-03, grad_scale: 4.0 2023-09-30 15:14:48,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 15:14:52,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:14:53,000 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:14:54,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 15:14:55,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:14:56,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:15:00,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. 
Duration: 0.5818125 2023-09-30 15:15:02,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:02,220 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:15:02,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:06,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:15:10,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:15:10,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:15:10,950 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:12,533 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:14,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:17,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:18,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:15:24,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. 
Duration: 0.93 2023-09-30 15:15:24,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:15:26,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:26,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 15:15:28,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:28,398 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 15:15:28,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:28,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:32,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:36,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:37,435 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:15:39,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 15:15:39,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. 
Duration: 0.77 2023-09-30 15:15:40,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 15:15:43,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:15:43,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 15:15:43,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:48,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:15:48,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:15:50,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 15:15:50,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:15:51,663 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:15:51,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:15:53,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:54,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. 
Duration: 0.85 2023-09-30 15:15:57,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:57,949 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:15:59,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:16:01,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:16:06,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:16:08,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 15:16:08,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=752360.0, ans=0.0 2023-09-30 15:16:09,839 INFO [train.py:1039] (0/4) Epoch 22, batch 1300, loss[loss=0.1738, simple_loss=0.239, pruned_loss=0.05426, over 23537.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2513, pruned_loss=0.04896, over 4721808.60 frames. ], batch size: 256, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:16:11,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:11,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:16:13,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:16:14,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:16:16,313 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:16:17,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 15:16:24,518 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=752426.6666666666, ans=0.1 2023-09-30 15:16:25,715 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:16:25,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:16:27,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 15:16:27,811 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=752426.6666666666, ans=0.125 2023-09-30 15:16:30,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=752426.6666666666, ans=0.0 2023-09-30 15:16:31,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:16:35,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 15:16:37,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:16:37,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:39,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:41,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:16:41,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:16:41,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 15:16:43,121 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.847e+02 1.992e+02 2.255e+02 3.036e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 15:16:47,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:16:47,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:16:50,885 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 15:16:51,653 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.whiten.whitening_limit, batch_count=752493.3333333334, ans=12.0 2023-09-30 15:16:52,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 15:16:54,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:16:57,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:57,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 15:16:57,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:16:59,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 15:17:00,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:06,856 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:06,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:17:09,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 15:17:10,063 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 15:17:12,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. 
Duration: 0.66 2023-09-30 15:17:17,926 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:17:17,951 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=752626.6666666666, ans=0.1 2023-09-30 15:17:19,042 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:17:22,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 15:17:23,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:30,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 15:17:31,369 INFO [train.py:1039] (0/4) Epoch 22, batch 1350, loss[loss=0.1487, simple_loss=0.2284, pruned_loss=0.03445, over 24344.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2505, pruned_loss=0.04849, over 4727803.26 frames. ], batch size: 61, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:17:35,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:35,716 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=752693.3333333334, ans=0.1 2023-09-30 15:17:36,088 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.80 vs. limit=15.0 2023-09-30 15:17:38,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:17:39,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:41,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:41,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:17:43,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:48,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:50,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 15:17:50,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:17:52,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:17:54,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.87 vs. limit=15.0 2023-09-30 15:17:56,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 15:17:57,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:58,882 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:17:58,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 15:18:00,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 15:18:02,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 15:18:03,759 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:03,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 15:18:15,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:25,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:26,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:26,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 15:18:29,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:31,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 15:18:31,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. 
Duration: 0.6 2023-09-30 15:18:31,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:18:35,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:18:37,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 15:18:37,849 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=752960.0, ans=0.09899494936611666 2023-09-30 15:18:39,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:18:41,247 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=752960.0, ans=0.0 2023-09-30 15:18:45,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 15:18:47,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 15:18:49,113 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=752960.0, ans=0.1 2023-09-30 15:18:53,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 15:18:53,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:55,671 INFO [train.py:1039] (0/4) Epoch 22, batch 1400, loss[loss=0.1648, simple_loss=0.2405, pruned_loss=0.04455, over 24248.00 frames. 
], tot_loss[loss=0.1738, simple_loss=0.2501, pruned_loss=0.04877, over 4721494.21 frames. ], batch size: 56, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:18:57,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:18:57,512 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:19:04,593 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 15:19:06,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 15:19:14,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:19:16,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:19,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:19:19,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:19:25,552 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:19:25,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. 
Duration: 0.9681875 2023-09-30 15:19:29,130 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.828e+02 2.054e+02 2.330e+02 3.479e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 15:19:31,711 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.56 vs. limit=22.5 2023-09-30 15:19:38,082 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:38,189 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:41,763 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.76 vs. limit=15.0 2023-09-30 15:19:42,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 15:19:44,028 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:19:44,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:19:45,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:19:45,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=753226.6666666666, ans=0.125 2023-09-30 15:19:47,763 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:19:47,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:19:47,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:19:48,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:19:50,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 15:19:50,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:19:55,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:58,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:20:08,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 15:20:08,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:20:11,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:20:13,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:20:16,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:20:17,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:20:19,329 INFO [train.py:1039] (0/4) Epoch 22, batch 1450, loss[loss=0.1879, simple_loss=0.2554, pruned_loss=0.06021, over 23810.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2493, pruned_loss=0.04883, over 4714331.67 frames. ], batch size: 195, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:20:21,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:20:23,311 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:20:23,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:23,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:20:24,345 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.63 vs. limit=22.5 2023-09-30 15:20:29,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:29,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:20:31,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:20:32,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. 
Duration: 0.9 2023-09-30 15:20:32,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:20:34,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 15:20:35,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:35,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:35,556 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 15:20:37,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:20:38,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:20:38,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 15:20:38,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:40,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:20:40,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=753426.6666666666, ans=0.125 2023-09-30 15:20:43,690 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:20:47,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:51,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:20:51,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:20:54,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:54,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:55,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:57,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:20:57,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:57,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:02,448 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 15:21:04,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:21:06,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.80 vs. limit=15.0 2023-09-30 15:21:07,260 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 15:21:08,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:10,405 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:21:11,928 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:13,639 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=753560.0, ans=0.0 2023-09-30 15:21:14,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 15:21:18,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:20,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 15:21:21,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 15:21:23,483 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:27,144 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 15:21:27,211 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:28,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 15:21:31,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 15:21:33,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 15:21:33,673 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:35,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:21:41,326 INFO [train.py:1039] (0/4) Epoch 22, batch 1500, loss[loss=0.1634, simple_loss=0.2374, pruned_loss=0.04472, over 23656.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2498, pruned_loss=0.04886, over 4715100.09 frames. ], batch size: 256, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:21:45,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 15:21:46,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:21:46,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:21:47,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 15:21:47,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:47,853 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=753693.3333333334, ans=0.2 2023-09-30 15:21:49,145 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:21:50,675 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 15:21:52,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:21:53,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:21:53,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:54,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:56,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:21:57,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:02,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:02,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. 
Duration: 0.87 2023-09-30 15:22:04,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:04,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:22:04,403 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:07,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 15:22:12,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 15:22:14,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:22:14,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 15:22:15,602 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.832e+02 2.019e+02 2.350e+02 4.853e+02, threshold=4.037e+02, percent-clipped=1.0 2023-09-30 15:22:17,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:22:19,069 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:22:20,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:22:21,692 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:22:21,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:22:23,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 15:22:23,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:22:24,738 WARNING [train.py:1197] (0/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:24,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 15:22:24,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:30,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:22:30,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 15:22:34,405 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=753893.3333333334, ans=0.125 2023-09-30 15:22:37,041 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:22:37,455 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=753893.3333333334, ans=0.125 2023-09-30 15:22:38,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. 
Duration: 0.6545625 2023-09-30 15:22:42,270 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 15:22:43,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:43,733 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 15:22:46,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:22:46,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:22:46,955 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=753960.0, ans=0.125 2023-09-30 15:22:48,228 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 15:22:49,658 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:52,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 15:22:52,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:22:53,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=753960.0, ans=0.125 2023-09-30 15:22:53,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=753960.0, ans=0.125 2023-09-30 15:22:57,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:57,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:58,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:58,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:59,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:23:03,215 INFO [train.py:1039] (0/4) Epoch 22, batch 1550, loss[loss=0.1809, simple_loss=0.2677, pruned_loss=0.04698, over 24570.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2506, pruned_loss=0.04947, over 4701840.36 frames. ], batch size: 71, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:23:03,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 15:23:04,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 15:23:04,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 15:23:06,919 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 15:23:06,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 15:23:08,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:10,232 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:10,286 WARNING [train.py:1197] (0/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:11,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:23:11,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:13,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:17,024 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 15:23:17,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:17,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:23:18,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 15:23:21,500 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:23:21,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 15:23:21,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:23,060 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 15:23:24,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 15:23:24,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 15:23:24,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:25,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.62 vs. limit=10.0 2023-09-30 15:23:26,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:30,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:23:32,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 15:23:32,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. 
Duration: 0.69 2023-09-30 15:23:34,824 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.38 vs. limit=12.0 2023-09-30 15:23:42,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:42,531 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=754160.0, ans=0.125 2023-09-30 15:23:46,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:48,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:23:48,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:23:49,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 15:23:53,428 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=754226.6666666666, ans=0.0 2023-09-30 15:23:56,011 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:23:56,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:59,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:24:00,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:24:02,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:02,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 15:24:03,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:05,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:24:05,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:06,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:24:06,976 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 15:24:10,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:16,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 15:24:22,786 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:22,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:24,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. 
Duration: 0.42 2023-09-30 15:24:24,516 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=754360.0, ans=0.1 2023-09-30 15:24:26,475 INFO [train.py:1039] (0/4) Epoch 22, batch 1600, loss[loss=0.1946, simple_loss=0.2597, pruned_loss=0.0647, over 23694.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04986, over 4705125.85 frames. ], batch size: 232, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:24:26,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:28,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:28,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:24:28,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:24:29,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:24:34,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:34,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 15:24:35,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. 
Duration: 0.92 2023-09-30 15:24:36,044 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=754360.0, ans=0.125 2023-09-30 15:24:38,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 15:24:40,393 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:24:41,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 15:24:42,054 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:24:43,810 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=754426.6666666666, ans=0.125 2023-09-30 15:24:45,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:24:50,954 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:54,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 15:24:57,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:24:59,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 15:24:59,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:24:59,255 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=754493.3333333334, ans=0.125 2023-09-30 15:25:00,928 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.869e+02 2.022e+02 2.307e+02 3.871e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 15:25:01,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 15:25:01,513 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=754493.3333333334, ans=0.125 2023-09-30 15:25:07,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 15:25:07,823 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=15.0 2023-09-30 15:25:09,103 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=754493.3333333334, ans=0.125 2023-09-30 15:25:13,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:15,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 15:25:16,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:16,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:25:16,485 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:25:18,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 15:25:25,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 15:25:26,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:25:26,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,243 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:25:28,463 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=754560.0, ans=0.09899494936611666 2023-09-30 15:25:31,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:25:32,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:25:33,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 15:25:40,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:40,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:25:43,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 15:25:43,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:25:45,956 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.00 vs. limit=15.0 2023-09-30 15:25:46,459 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 15:25:47,986 INFO [train.py:1039] (0/4) Epoch 22, batch 1650, loss[loss=0.1718, simple_loss=0.2398, pruned_loss=0.05195, over 23671.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2518, pruned_loss=0.04962, over 4711346.74 frames. ], batch size: 149, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:25:51,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:25:51,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:25:52,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:25:52,849 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. 
Duration: 0.87 2023-09-30 15:25:52,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 15:25:52,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 15:25:52,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 15:25:58,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:58,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:00,161 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:00,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:26:03,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:06,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 15:26:08,453 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:26:08,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:08,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 15:26:08,498 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:26:09,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 15:26:10,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 15:26:13,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=754760.0, ans=0.125 2023-09-30 15:26:16,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:26:19,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:26:27,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 15:26:27,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:28,946 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 15:26:32,905 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:35,173 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.23 vs. limit=15.0 2023-09-30 15:26:35,841 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:26:35,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:26:36,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=754893.3333333334, ans=0.2 2023-09-30 15:26:37,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:26:38,090 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:26:38,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:41,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:43,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:43,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:44,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:46,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:26:46,478 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=754893.3333333334, ans=0.1 2023-09-30 15:26:47,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:26:50,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:52,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 15:26:53,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:53,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 15:26:55,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 15:26:56,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 15:26:56,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:57,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:26:57,154 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:26:57,450 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=754960.0, ans=0.95 2023-09-30 15:26:58,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:58,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 15:26:59,039 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=754960.0, ans=0.0 2023-09-30 15:27:00,582 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=754960.0, ans=0.125 2023-09-30 15:27:01,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:27:04,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:04,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:07,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 15:27:10,571 INFO [train.py:1039] (0/4) Epoch 22, batch 1700, loss[loss=0.1578, simple_loss=0.2408, pruned_loss=0.03743, over 24481.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2511, pruned_loss=0.04955, over 4706370.24 frames. ], batch size: 66, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:27:12,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:27:12,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:27:14,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 15:27:16,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:16,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:27:16,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:17,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:27:18,137 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=755026.6666666666, ans=0.125 2023-09-30 15:27:19,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:27:19,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 15:27:22,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:27:27,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:30,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:27:37,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:27:37,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:27:37,251 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:39,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:27:42,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 15:27:44,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:27:44,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:44,340 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=755160.0, ans=0.125 2023-09-30 15:27:45,846 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.865e+02 2.081e+02 2.403e+02 3.253e+02, threshold=4.162e+02, percent-clipped=0.0 2023-09-30 15:27:46,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. 
Duration: 0.82225 2023-09-30 15:27:46,477 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=755160.0, ans=0.125 2023-09-30 15:27:48,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:27:49,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 15:27:51,408 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 15:27:51,594 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:53,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 15:27:54,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:56,839 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.91 vs. limit=15.0 2023-09-30 15:28:03,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:05,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:06,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:28:07,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. 
Duration: 0.663625 2023-09-30 15:28:07,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 15:28:07,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:28:10,748 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:10,749 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 15:28:10,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:28:10,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:12,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:13,017 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:13,244 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=755226.6666666666, ans=0.125 2023-09-30 15:28:13,358 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=755226.6666666666, ans=0.1 2023-09-30 15:28:16,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:28:16,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:28:17,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:19,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:28:19,826 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:23,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:25,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 15:28:28,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:29,618 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:32,615 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 15:28:32,869 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=755360.0, ans=0.125 2023-09-30 15:28:34,062 INFO [train.py:1039] (0/4) Epoch 22, batch 1750, loss[loss=0.164, simple_loss=0.2379, pruned_loss=0.0451, over 23673.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2496, pruned_loss=0.04904, over 4693785.42 frames. 
], batch size: 135, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:28:38,865 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:41,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:41,870 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:28:43,325 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 15:28:43,386 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:46,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:28:46,469 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:52,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 15:28:54,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:58,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 15:28:58,275 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:59,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 15:29:02,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:29:02,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 15:29:06,043 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:29:06,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 15:29:13,724 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:29:16,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:16,841 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:20,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:20,049 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:22,350 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:29:24,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:28,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 15:29:28,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:29:29,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 15:29:32,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:32,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=755560.0, ans=0.0 2023-09-30 15:29:35,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 15:29:36,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:37,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=755560.0, ans=0.125 2023-09-30 15:29:38,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:38,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:29:40,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=755626.6666666666, ans=0.0 2023-09-30 15:29:43,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. 
Duration: 0.6333125 2023-09-30 15:29:43,292 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=755626.6666666666, ans=0.0 2023-09-30 15:29:44,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:29:44,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:46,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:50,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:52,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:29:54,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:29:54,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 15:29:56,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:56,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:29:56,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:29:56,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 15:29:58,267 INFO [train.py:1039] (0/4) Epoch 22, batch 1800, loss[loss=0.1819, simple_loss=0.2662, pruned_loss=0.04885, over 24564.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2489, pruned_loss=0.04887, over 4682543.78 frames. ], batch size: 71, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:29:58,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:29:58,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:30:01,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:30:02,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:30:04,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:30:05,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:30:10,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:30:11,842 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:30:13,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:16,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:30:18,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:30:19,752 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:30:19,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 15:30:21,161 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:24,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:26,060 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=755760.0, ans=0.125 2023-09-30 15:30:27,888 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 15:30:31,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 15:30:31,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 15:30:33,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:30:34,970 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.922e+02 2.224e+02 2.607e+02 3.579e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-30 15:30:35,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:35,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:30:37,162 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:30:42,164 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 15:30:43,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:30:45,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:48,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 15:30:49,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 15:30:49,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:30:51,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:30:52,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 15:30:57,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 15:31:04,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:04,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 15:31:05,926 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:31:05,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:08,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:31:08,076 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 15:31:09,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:31:09,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:12,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 15:31:12,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:15,362 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:31:16,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:31:16,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:31:19,997 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:31:20,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:21,268 INFO [train.py:1039] (0/4) Epoch 22, batch 1850, loss[loss=0.1491, simple_loss=0.2292, pruned_loss=0.03447, over 24345.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2487, pruned_loss=0.04865, over 4687362.99 frames. ], batch size: 61, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:31:22,232 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-09-30 15:31:24,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:31:24,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:31:32,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 15:31:32,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 15:31:35,385 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 15:31:39,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 15:31:44,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:44,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 15:31:45,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 15:31:54,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:55,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 15:31:58,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:31:58,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:02,217 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=756160.0, ans=0.125 2023-09-30 15:32:03,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. 
Duration: 0.75 2023-09-30 15:32:04,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:05,009 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:32:06,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:32:08,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:32:13,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:32:13,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756226.6666666666, ans=0.1 2023-09-30 15:32:17,042 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:32:17,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:17,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:32:17,151 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:18,643 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:32:20,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:32:23,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 15:32:25,201 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:29,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:32:31,253 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:32:31,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 15:32:31,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 15:32:34,184 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 15:32:34,311 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 15:32:35,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:32:35,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:35,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. 
Duration: 0.92225 2023-09-30 15:32:35,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:37,492 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 15:32:37,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:32:37,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:39,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:32:40,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:32:41,044 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:32:41,988 INFO [train.py:1039] (0/4) Epoch 22, batch 1900, loss[loss=0.1879, simple_loss=0.256, pruned_loss=0.0599, over 23672.00 frames. ], tot_loss[loss=0.1733, simple_loss=0.2496, pruned_loss=0.0485, over 4702336.11 frames. ], batch size: 232, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:32:42,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:32:42,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 15:32:43,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:32:43,800 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 15:32:43,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:32:45,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:52,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:54,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:32:56,151 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 15:32:57,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 15:32:57,926 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=756426.6666666666, ans=0.125 2023-09-30 15:32:59,117 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:59,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:33:01,254 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 15:33:01,296 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 15:33:04,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. 
Duration: 0.99 2023-09-30 15:33:06,265 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=756426.6666666666, ans=0.125 2023-09-30 15:33:07,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:33:11,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 15:33:12,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 15:33:18,266 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.802e+02 1.975e+02 2.316e+02 3.738e+02, threshold=3.949e+02, percent-clipped=0.0 2023-09-30 15:33:23,481 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 15:33:27,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 15:33:27,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:33:28,023 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 15:33:28,029 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 15:33:28,269 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=756493.3333333334, ans=0.09899494936611666 2023-09-30 15:33:29,380 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. 
Duration: 0.98 2023-09-30 15:33:29,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 15:33:29,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:33:32,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 15:33:36,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:33:38,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:33:38,491 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 15:33:41,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:33:44,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 15:33:45,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:33:46,165 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=756560.0, ans=0.07 2023-09-30 15:33:53,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:33:53,414 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 15:33:53,434 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:33:53,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:33:55,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:33:57,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:33:58,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:34:00,898 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:00,901 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:03,837 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:34:03,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:05,172 INFO [train.py:1039] (0/4) Epoch 22, batch 1950, loss[loss=0.1883, simple_loss=0.2702, pruned_loss=0.05322, over 24637.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.249, pruned_loss=0.0479, over 4709331.30 frames. ], batch size: 68, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:34:05,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. 
Duration: 0.811125 2023-09-30 15:34:06,889 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:10,066 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:12,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:34:13,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:13,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:34:17,300 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.08 vs. limit=15.0 2023-09-30 15:34:18,097 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 15:34:18,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:34:18,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:19,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:22,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 15:34:22,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:22,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:22,990 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=756760.0, ans=0.1 2023-09-30 15:34:25,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:34:26,056 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=756760.0, ans=0.125 2023-09-30 15:34:29,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:29,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:34:29,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:34:31,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:31,419 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=756760.0, ans=0.125 2023-09-30 15:34:33,009 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=756760.0, ans=0.07 2023-09-30 15:34:34,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:34:34,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=756760.0, ans=0.0 2023-09-30 15:34:38,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:38,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:38,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:34:38,725 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 15:34:40,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:34:40,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:34:41,712 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:46,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:47,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:52,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-30 15:34:52,784 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=756893.3333333334, ans=0.125 2023-09-30 15:34:57,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:34:57,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:34:57,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 15:34:59,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:03,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:35:03,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:35:05,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:14,845 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:14,962 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:15,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=756960.0, ans=0.125 2023-09-30 15:35:17,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:35:19,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:22,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:35:22,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:24,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 15:35:24,187 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:35:25,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:35:27,158 INFO [train.py:1039] (0/4) Epoch 22, batch 2000, loss[loss=0.1674, simple_loss=0.249, pruned_loss=0.04289, over 23928.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2496, pruned_loss=0.04807, over 4709756.11 frames. ], batch size: 86, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:35:27,234 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 15:35:29,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:32,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:34,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 15:35:34,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:37,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:35:38,660 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:40,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 15:35:42,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:35:44,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:35:46,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 15:35:49,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:35:49,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:52,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:35:54,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 15:35:54,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:35:54,847 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=757093.3333333334, ans=0.2 2023-09-30 15:35:56,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 15:35:59,099 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:36:00,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 15:36:00,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:01,583 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=757160.0, ans=0.125 2023-09-30 15:36:04,196 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.020e+02 2.308e+02 2.637e+02 3.987e+02, threshold=4.617e+02, percent-clipped=1.0 2023-09-30 15:36:04,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:05,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:36:05,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:36:05,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:07,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:08,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 15:36:11,898 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 15:36:11,909 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:11,921 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:12,477 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.98 vs. limit=15.0 2023-09-30 15:36:17,045 WARNING [train.py:1197] (0/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:17,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757226.6666666666, ans=0.1 2023-09-30 15:36:18,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:36:18,550 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:18,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 15:36:20,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:22,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:22,859 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.40 vs. limit=15.0 2023-09-30 15:36:23,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:23,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:24,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:27,105 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:27,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 15:36:29,018 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=757226.6666666666, ans=0.0 2023-09-30 15:36:33,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:36:33,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:36:38,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:38,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:36:41,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:43,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:43,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:45,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:36:45,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:36:50,065 INFO [train.py:1039] (0/4) Epoch 22, batch 2050, loss[loss=0.1565, simple_loss=0.232, pruned_loss=0.04054, over 24278.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2492, pruned_loss=0.04841, over 4710489.64 frames. ], batch size: 56, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:36:50,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:51,582 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:36:53,831 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=757360.0, ans=0.125 2023-09-30 15:36:55,143 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:55,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:01,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:37:01,824 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=757360.0, ans=0.0 2023-09-30 15:37:03,075 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:37:03,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:04,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:06,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 15:37:06,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:37:06,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:07,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 15:37:17,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:17,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:19,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 15:37:20,213 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=757426.6666666666, ans=0.09899494936611666 2023-09-30 15:37:22,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:23,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 15:37:23,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:24,172 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757493.3333333334, ans=0.1 2023-09-30 15:37:27,495 WARNING [train.py:1197] (0/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:27,726 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=757493.3333333334, ans=0.0 2023-09-30 15:37:30,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:37:32,113 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:37:32,176 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:35,064 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:37:36,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:37:36,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:37:39,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:41,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:37:43,753 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:37:45,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:48,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:37:53,118 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=757560.0, ans=0.0 2023-09-30 15:37:55,003 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. 
Duration: 0.9666875 2023-09-30 15:37:58,330 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 15:38:03,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:04,000 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=757626.6666666666, ans=0.2 2023-09-30 15:38:05,233 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:38:08,390 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:38:08,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 15:38:11,769 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 15:38:11,769 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:11,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:13,220 INFO [train.py:1039] (0/4) Epoch 22, batch 2100, loss[loss=0.1658, simple_loss=0.2589, pruned_loss=0.03631, over 24317.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2476, pruned_loss=0.04801, over 4708532.26 frames. ], batch size: 74, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:38:13,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 15:38:13,445 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:13,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 15:38:14,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 15:38:17,725 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:38:20,229 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=757693.3333333334, ans=0.0 2023-09-30 15:38:21,324 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:38:21,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:38:24,437 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:24,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:38:24,560 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 15:38:26,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 15:38:26,334 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=757693.3333333334, ans=0.125 2023-09-30 15:38:28,256 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 15:38:28,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 15:38:31,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:38:31,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:38:31,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 15:38:31,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 15:38:38,293 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 15:38:38,295 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:41,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:38:41,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:46,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 15:38:46,182 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 15:38:47,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:38:47,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:38:49,198 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.822e+02 2.007e+02 2.255e+02 3.053e+02, threshold=4.015e+02, percent-clipped=0.0 2023-09-30 15:38:49,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 15:38:49,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:49,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 15:38:50,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 15:38:50,978 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 15:38:52,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:38:56,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:38:57,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 15:38:58,311 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.58 vs. limit=22.5 2023-09-30 15:38:59,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:39:01,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:05,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 15:39:05,096 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:05,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:06,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 15:39:08,933 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 15:39:09,024 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 15:39:13,460 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 15:39:17,987 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:39:18,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 15:39:22,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:24,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:39:25,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:39:25,917 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:39:25,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:39:26,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:39:27,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:27,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:39:31,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:39:31,299 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:39:32,795 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 15:39:34,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 15:39:34,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:36,506 INFO [train.py:1039] (0/4) Epoch 22, batch 2150, loss[loss=0.169, simple_loss=0.2522, pruned_loss=0.04293, over 24296.00 frames. ], tot_loss[loss=0.1714, simple_loss=0.247, pruned_loss=0.04787, over 4714634.81 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:39:36,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:36,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:39:36,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:39:38,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:39:44,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:39:44,616 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758026.6666666666, ans=0.1 2023-09-30 15:39:45,895 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 15:39:47,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:49,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:39:49,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:50,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:39:53,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:53,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:39:53,726 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:39:56,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:58,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 15:40:01,549 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:05,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:40:07,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:40:07,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:07,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:40:08,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:08,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:40:10,235 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:40:10,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 15:40:12,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:40:14,632 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:14,696 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:16,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:40:16,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:40:19,384 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:20,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:40:22,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:22,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 15:40:22,364 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:40:24,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:24,310 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=758160.0, ans=0.125 2023-09-30 15:40:25,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:26,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:28,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:40:28,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:30,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:40:30,031 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 15:40:32,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 15:40:33,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:40:34,530 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 15:40:34,606 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:34,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:40:36,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 15:40:36,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:40:36,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 15:40:36,383 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=758226.6666666666, ans=0.125 2023-09-30 15:40:37,561 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 15:40:37,562 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0079-335886 from training. 
Duration: 0.896 2023-09-30 15:40:37,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 15:40:39,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:40,047 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:40,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:40:41,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:42,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:40:44,463 WARNING [train.py:1197] (0/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:44,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:52,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:40:52,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 15:40:56,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=758293.3333333334, ans=0.2 2023-09-30 15:40:58,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:40:59,520 INFO [train.py:1039] (0/4) Epoch 22, batch 2200, loss[loss=0.186, simple_loss=0.2566, pruned_loss=0.05775, over 23421.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2482, pruned_loss=0.04821, over 4711174.63 frames. ], batch size: 285, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:41:00,046 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758360.0, ans=0.1 2023-09-30 15:41:01,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:02,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:41:02,840 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:04,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:41:07,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:41:08,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:41:08,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 15:41:14,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 15:41:17,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. 
Duration: 0.4909375 2023-09-30 15:41:17,631 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=758426.6666666666, ans=0.125 2023-09-30 15:41:21,793 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=758426.6666666666, ans=0.125 2023-09-30 15:41:24,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 15:41:24,937 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:41:26,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:26,379 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:27,677 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:41:30,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:41:31,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 15:41:31,632 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.45 vs. limit=22.5 2023-09-30 15:41:34,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. 
Duration: 0.688875 2023-09-30 15:41:34,430 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=758493.3333333334, ans=0.125 2023-09-30 15:41:36,961 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.820e+02 1.960e+02 2.258e+02 2.788e+02, threshold=3.920e+02, percent-clipped=0.0 2023-09-30 15:41:37,084 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:38,510 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 15:41:41,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:41:43,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:44,477 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.95 vs. limit=15.0 2023-09-30 15:41:45,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:41:46,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:48,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 15:41:50,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:41:53,060 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 15:41:56,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:56,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:41:56,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:58,735 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.24 vs. limit=10.0 2023-09-30 15:41:59,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:59,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:59,628 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:59,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:42:01,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:42:01,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:42:04,355 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:42:07,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:42:08,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:10,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:42:12,072 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 15:42:14,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:42:15,404 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 15:42:15,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:42:16,881 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 15:42:18,428 WARNING [train.py:1197] (0/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:19,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:42:20,057 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:42:21,843 INFO [train.py:1039] (0/4) Epoch 22, batch 2250, loss[loss=0.1709, simple_loss=0.2533, pruned_loss=0.04424, over 24467.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2493, pruned_loss=0.04851, over 4720483.13 frames. ], batch size: 63, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:42:22,019 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 15:42:25,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:42:27,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:27,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=758693.3333333334, ans=0.125 2023-09-30 15:42:34,254 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:42:35,858 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:42:38,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:39,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:40,610 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:42,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. 
Duration: 0.93 2023-09-30 15:42:42,288 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:42:42,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:42:43,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 15:42:45,482 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:42:45,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:45,789 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=758760.0, ans=0.1 2023-09-30 15:42:47,146 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:47,628 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.40 vs. limit=10.0 2023-09-30 15:42:51,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:53,614 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:42:55,072 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. 
Duration: 0.6 2023-09-30 15:42:56,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 15:42:56,851 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=758826.6666666666, ans=0.1 2023-09-30 15:42:58,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:43:03,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:43:07,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:09,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:10,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:43:10,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:43:13,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:43:15,148 WARNING [train.py:1197] (0/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:43:17,208 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.54 vs. 
limit=22.5 2023-09-30 15:43:19,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:43:21,168 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:43:24,540 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=758893.3333333334, ans=10.0 2023-09-30 15:43:25,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:43:25,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:43:27,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:43:33,800 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:43:36,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:43:36,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 15:43:36,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:37,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:43:41,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. 
Duration: 0.78 2023-09-30 15:43:44,066 INFO [train.py:1039] (0/4) Epoch 22, batch 2300, loss[loss=0.1804, simple_loss=0.2662, pruned_loss=0.04728, over 24364.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2505, pruned_loss=0.04866, over 4734354.61 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:43:44,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:43:44,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:45,157 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.30 vs. limit=22.5 2023-09-30 15:43:50,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,577 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:43:53,681 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 15:43:55,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:02,247 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:44:03,602 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:44:03,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:44:05,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:05,037 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 15:44:05,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:44:08,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:08,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:44:11,854 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:44:14,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:44:14,436 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=759093.3333333334, ans=0.125 2023-09-30 15:44:18,203 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.84 vs. limit=22.5 2023-09-30 15:44:18,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 15:44:21,553 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.881e+02 2.122e+02 2.530e+02 4.417e+02, threshold=4.245e+02, percent-clipped=2.0 2023-09-30 15:44:24,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:44:24,871 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:27,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:44:30,299 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.88 vs. limit=22.5 2023-09-30 15:44:31,083 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:44:35,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:35,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:44:37,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:44:37,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. 
Duration: 0.47 2023-09-30 15:44:41,360 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=759226.6666666666, ans=0.0 2023-09-30 15:44:42,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:44:42,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:43,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:44,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:44:45,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:44:45,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 15:44:45,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:44:47,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 15:44:47,094 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:44:47,116 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:49,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. 
Duration: 0.41 2023-09-30 15:44:55,359 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:44:59,883 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:45:04,656 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:04,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:45:04,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:45:06,152 INFO [train.py:1039] (0/4) Epoch 22, batch 2350, loss[loss=0.1784, simple_loss=0.2645, pruned_loss=0.04615, over 24364.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2509, pruned_loss=0.04877, over 4730959.16 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:45:07,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:45:07,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:07,857 WARNING [train.py:1197] (0/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:45:09,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. 
Duration: 0.84 2023-09-30 15:45:09,788 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=759360.0, ans=0.125 2023-09-30 15:45:14,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:45:14,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 15:45:22,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 15:45:25,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:45:29,495 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:29,578 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:31,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 15:45:34,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:45:38,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. 
Duration: 0.58 2023-09-30 15:45:40,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:41,084 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.81 vs. limit=10.0 2023-09-30 15:45:43,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:45:43,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:46,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:45:48,576 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 15:45:48,700 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:45:51,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:51,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:45:51,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:54,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 15:45:55,337 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=759560.0, ans=0.0 2023-09-30 15:45:56,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 15:45:58,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:46:02,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:46:03,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:46:05,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 15:46:05,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:46:08,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 15:46:08,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:46:13,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 15:46:17,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 15:46:19,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:46:19,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:46:19,955 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 15:46:21,343 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 15:46:22,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 15:46:26,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:46:27,432 INFO [train.py:1039] (0/4) Epoch 22, batch 2400, loss[loss=0.1705, simple_loss=0.2502, pruned_loss=0.04538, over 23395.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2507, pruned_loss=0.04889, over 4731247.53 frames. ], batch size: 105, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:46:30,775 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:46:31,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=759693.3333333334, ans=0.125 2023-09-30 15:46:34,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:46:35,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:46:37,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. 
Duration: 0.98 2023-09-30 15:46:37,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 15:46:37,347 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=759693.3333333334, ans=0.0 2023-09-30 15:46:44,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:46:44,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:46:46,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 15:46:47,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:46:49,452 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:49,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 15:46:52,764 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=759760.0, ans=0.1 2023-09-30 15:46:57,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:58,615 INFO [scaling.py:1022] (0/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.71 vs. limit=8.0 2023-09-30 15:46:59,004 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. 
Duration: 0.96 2023-09-30 15:47:02,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:47:05,700 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.822e+02 2.005e+02 2.211e+02 3.199e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 15:47:08,025 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 15:47:11,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:13,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:16,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:17,851 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 15:47:19,244 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:47:24,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:24,462 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=759893.3333333334, ans=0.0 2023-09-30 15:47:25,305 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.20 vs. 
limit=15.0 2023-09-30 15:47:27,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:47:30,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:47:32,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:47:32,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:47:32,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:47:32,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:32,782 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=759960.0, ans=0.0 2023-09-30 15:47:33,990 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:47:34,020 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:47:37,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=759960.0, ans=0.125 2023-09-30 15:47:38,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:47:40,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. 
Duration: 0.7181875 2023-09-30 15:47:40,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 15:47:43,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 15:47:44,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:44,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:46,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 15:47:46,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 15:47:47,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 15:47:47,704 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 15:47:47,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 15:47:49,341 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:50,806 INFO [train.py:1039] (0/4) Epoch 22, batch 2450, loss[loss=0.1722, simple_loss=0.2625, pruned_loss=0.04088, over 24314.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2482, pruned_loss=0.04868, over 4703889.33 frames. 
], batch size: 74, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:47:50,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:50,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:47:52,420 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 15:47:52,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:52,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:47:57,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:47:57,258 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:00,982 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:00,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:02,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 15:48:04,357 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=760026.6666666666, ans=0.1 2023-09-30 15:48:07,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:48:07,484 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=760093.3333333334, ans=0.125 2023-09-30 15:48:08,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:10,485 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:48:10,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:48:11,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:48:12,034 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 15:48:17,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:20,275 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:48:20,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:48:23,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:48:23,694 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. 
Duration: 0.9444375 2023-09-30 15:48:24,318 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.17 vs. limit=15.0 2023-09-30 15:48:25,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:48:28,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 15:48:29,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:48:38,129 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:39,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:41,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:41,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:48:41,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:42,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 15:48:43,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 15:48:46,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:48,289 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:48:52,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:53,375 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:56,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=760293.3333333334, ans=0.125 2023-09-30 15:48:58,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:48:58,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 15:48:58,277 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:48:59,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:59,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 15:49:01,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:49:01,413 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:49:06,391 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:49:06,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=760293.3333333334, ans=0.125 2023-09-30 15:49:10,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:10,179 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:49:13,149 INFO [train.py:1039] (0/4) Epoch 22, batch 2500, loss[loss=0.1775, simple_loss=0.2657, pruned_loss=0.04461, over 24451.00 frames. ], tot_loss[loss=0.1722, simple_loss=0.2476, pruned_loss=0.04837, over 4697912.82 frames. ], batch size: 69, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:49:13,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 15:49:14,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:49:21,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:27,409 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=760360.0, ans=0.125 2023-09-30 15:49:31,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 15:49:31,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:49:33,259 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:33,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 15:49:35,312 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.75 vs. limit=22.5 2023-09-30 15:49:40,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:49:42,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:49:42,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:49:42,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:49:44,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 15:49:44,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:46,279 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:49:46,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 15:49:46,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:47,789 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 15:49:47,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:49,720 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:49:50,875 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.767e+02 1.934e+02 2.176e+02 2.965e+02, threshold=3.869e+02, percent-clipped=0.0 2023-09-30 15:49:52,689 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:52,790 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:56,479 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:49:56,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 15:49:57,335 WARNING [train.py:1197] (0/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:49:58,850 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 15:49:59,097 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=760493.3333333334, ans=0.0 2023-09-30 15:50:03,662 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:08,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:09,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:14,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:50:16,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 15:50:18,088 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:18,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:19,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:50:19,779 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:50:21,255 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 15:50:21,255 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0677W0080-334912 from training. 
Duration: 0.768 2023-09-30 15:50:21,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 15:50:24,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:25,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 15:50:27,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 15:50:28,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:30,148 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 15:50:33,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 15:50:36,286 INFO [train.py:1039] (0/4) Epoch 22, batch 2550, loss[loss=0.1594, simple_loss=0.2454, pruned_loss=0.0367, over 24474.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2484, pruned_loss=0.04877, over 4699162.45 frames. ], batch size: 63, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:50:36,392 WARNING [train.py:1197] (0/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:37,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:50:37,941 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 15:50:40,971 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:41,085 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 15:50:42,486 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:50:45,751 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 15:50:47,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:50:48,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:52,585 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 15:50:53,994 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:50:54,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:50:54,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 15:50:54,271 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=760760.0, ans=0.0 2023-09-30 15:50:57,140 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:50:57,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 15:50:58,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:58,665 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:58,670 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 15:51:10,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:51:16,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:16,904 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:16,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:51:17,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:51:23,801 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 15:51:26,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:51:26,890 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:51:26,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:51:28,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:51:28,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:51:30,331 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=760893.3333333334, ans=0.1 2023-09-30 15:51:31,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:31,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:37,604 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=760893.3333333334, ans=0.125 2023-09-30 15:51:37,830 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.26 vs. limit=15.0 2023-09-30 15:51:38,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 15:51:38,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 15:51:38,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:51:40,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:41,923 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:51:43,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:51:45,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:45,395 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=760960.0, ans=0.125 2023-09-30 15:51:48,520 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=760960.0, ans=0.025 2023-09-30 15:51:51,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:51:54,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:57,746 INFO [train.py:1039] (0/4) Epoch 22, batch 2600, loss[loss=0.1831, simple_loss=0.2561, pruned_loss=0.0551, over 23969.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2493, pruned_loss=0.04923, over 4704607.78 frames. 
], batch size: 80, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:51:57,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 15:51:59,458 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 15:51:59,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:51:59,538 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 15:51:59,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 15:52:01,044 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 15:52:02,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:52:02,756 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 15:52:02,971 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=761026.6666666666, ans=0.125 2023-09-30 15:52:04,336 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 15:52:05,884 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 15:52:09,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:52:12,842 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. 
Duration: 0.82 2023-09-30 15:52:14,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 15:52:15,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:52:15,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 15:52:18,908 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 15:52:18,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 15:52:26,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:26,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:28,081 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:28,082 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 15:52:29,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 15:52:31,495 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=761160.0, ans=0.125 2023-09-30 15:52:34,440 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.852e+02 2.092e+02 2.401e+02 4.337e+02, threshold=4.185e+02, percent-clipped=2.0 2023-09-30 15:52:37,587 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 15:52:43,768 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:45,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:45,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 15:52:47,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:52:47,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:47,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 15:52:49,751 WARNING [train.py:1197] (0/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:52:49,896 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=761226.6666666666, ans=0.0 2023-09-30 15:52:51,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:52:51,450 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=761226.6666666666, ans=0.125 2023-09-30 15:52:52,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:54,677 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=761226.6666666666, ans=0.125 2023-09-30 15:52:57,195 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 15:52:57,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:58,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:53:02,101 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=761293.3333333334, ans=0.0 2023-09-30 15:53:04,906 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:53:05,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:53:05,058 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 15:53:06,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:53:08,666 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 15:53:10,531 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:13,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=761293.3333333334, ans=0.125 2023-09-30 15:53:15,817 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=761293.3333333334, ans=15.0 2023-09-30 15:53:16,532 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 15:53:18,066 INFO [train.py:1039] (0/4) Epoch 22, batch 2650, loss[loss=0.158, simple_loss=0.2312, pruned_loss=0.04244, over 20071.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.25, pruned_loss=0.0494, over 4709295.20 frames. ], batch size: 44, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:53:18,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:20,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:53:24,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 15:53:24,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:24,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=761360.0, ans=0.125 2023-09-30 15:53:26,572 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 15:53:27,928 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 15:53:27,945 WARNING [train.py:1197] (0/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:53:29,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:31,206 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:53:32,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:35,882 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:53:36,006 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 15:53:36,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:53:37,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:53:39,303 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=761426.6666666666, ans=0.0 2023-09-30 15:53:40,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 15:53:42,584 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9653W0194-675472 from training. 
Duration: 0.9658125 2023-09-30 15:53:42,891 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=761426.6666666666, ans=0.05 2023-09-30 15:53:44,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:48,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 15:53:48,664 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=761426.6666666666, ans=0.2 2023-09-30 15:53:49,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:53:49,974 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 15:53:54,535 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:54,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:53:54,614 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:56,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:00,470 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 15:54:00,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. 
Duration: 0.76 2023-09-30 15:54:04,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:09,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 15:54:09,671 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:09,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:11,225 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:11,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:11,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:12,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:14,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:14,716 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:54:16,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:54:18,160 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. 
Duration: 0.82725 2023-09-30 15:54:19,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:19,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:54:20,670 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.35 vs. limit=6.0 2023-09-30 15:54:21,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:22,912 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:22,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:54:23,158 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=761626.6666666666, ans=0.04949747468305833 2023-09-30 15:54:26,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:27,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:54:27,576 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:29,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. 
Duration: 0.82 2023-09-30 15:54:31,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:31,670 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=761626.6666666666, ans=0.125 2023-09-30 15:54:34,824 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:35,190 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=761626.6666666666, ans=0.0 2023-09-30 15:54:36,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:37,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:38,140 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=761626.6666666666, ans=0.0 2023-09-30 15:54:39,441 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:39,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:41,053 INFO [train.py:1039] (0/4) Epoch 22, batch 2700, loss[loss=0.188, simple_loss=0.2686, pruned_loss=0.05371, over 23939.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.251, pruned_loss=0.04946, over 4716319.47 frames. ], batch size: 86, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:54:42,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 15:54:42,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 15:54:44,372 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:54:47,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 15:54:47,639 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:54:50,316 WARNING [train.py:1197] (0/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:50,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,397 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,542 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:54:52,459 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:52,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:54:52,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:54:52,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. 
Duration: 0.63 2023-09-30 15:54:52,653 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:54:54,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:55,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:54:57,222 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:01,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:55:01,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 15:55:01,311 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=761760.0, ans=0.0 2023-09-30 15:55:02,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:07,387 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=761760.0, ans=0.125 2023-09-30 15:55:08,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:55:08,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:14,894 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. 
Duration: 0.836375 2023-09-30 15:55:14,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:55:14,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:55:16,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:55:19,390 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.905e+02 2.101e+02 2.408e+02 3.392e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-30 15:55:19,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:21,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=761826.6666666666, ans=0.125 2023-09-30 15:55:22,692 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:22,705 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:55:22,726 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:55:29,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:29,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. 
Duration: 0.988875 2023-09-30 15:55:35,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:55:37,822 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:55:41,503 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:55:41,506 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:43,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:45,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:46,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:47,026 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:48,590 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:48,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:55:52,936 WARNING [train.py:1197] (0/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 15:55:54,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:54,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:57,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 15:55:57,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:57,795 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=761960.0, ans=0.125 2023-09-30 15:55:57,931 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=761960.0, ans=0.1 2023-09-30 15:55:59,212 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:55:59,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 15:56:01,337 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 15:56:02,836 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:04,254 INFO [train.py:1039] (0/4) Epoch 22, batch 2750, loss[loss=0.1869, simple_loss=0.2633, pruned_loss=0.05527, over 23717.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2508, pruned_loss=0.04968, over 4705728.72 frames. 
], batch size: 85, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:56:05,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:05,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:07,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:09,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:56:09,386 WARNING [train.py:1197] (0/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:09,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=762026.6666666666, ans=0.05 2023-09-30 15:56:14,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:14,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:56:15,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:56:16,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:16,383 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. 
Duration: 0.73 2023-09-30 15:56:16,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:56:16,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:19,966 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=762093.3333333334, ans=0.1 2023-09-30 15:56:22,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 15:56:24,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:56:24,223 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:25,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:56:25,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:56:27,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:28,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:56:30,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 15:56:30,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:31,937 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=762093.3333333334, ans=0.0 2023-09-30 15:56:36,277 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:56:36,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:56:36,387 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:56:38,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:39,965 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:56:47,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:51,109 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:56:51,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:55,624 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 15:56:55,641 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:56:55,706 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:57:01,861 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:57:01,920 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:57:01,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 15:57:07,745 WARNING [train.py:1197] (0/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:09,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 15:57:16,059 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:57:18,183 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:57:18,224 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 15:57:19,746 WARNING [train.py:1197] (0/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:57:19,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. 
Duration: 0.9555625 2023-09-30 15:57:21,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 15:57:21,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:57:25,588 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 15:57:26,875 INFO [train.py:1039] (0/4) Epoch 22, batch 2800, loss[loss=0.1691, simple_loss=0.2116, pruned_loss=0.06326, over 19409.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2484, pruned_loss=0.04904, over 4686862.83 frames. ], batch size: 388, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:57:26,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:27,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:57:28,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 15:57:28,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:28,569 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:30,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:31,559 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0002-61808 from training. 
Duration: 0.708 2023-09-30 15:57:31,560 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 15:57:34,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:37,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:57:37,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:57:40,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:57:42,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 15:57:44,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:57:45,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 15:57:49,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:49,274 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:57:49,290 WARNING [train.py:1197] (0/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:57:54,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 15:57:54,492 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:54,498 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:57:55,011 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.20 vs. limit=10.0 2023-09-30 15:57:57,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:03,728 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=762493.3333333334, ans=0.1 2023-09-30 15:58:03,778 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=762493.3333333334, ans=0.2 2023-09-30 15:58:04,781 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.836e+02 2.011e+02 2.397e+02 3.435e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 15:58:06,433 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:58:08,002 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:09,642 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:11,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 15:58:11,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:15,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:16,022 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 15:58:17,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:18,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:18,972 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:58:19,216 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=762560.0, ans=0.1 2023-09-30 15:58:20,797 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=762560.0, ans=0.125 2023-09-30 15:58:24,149 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:26,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:29,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:31,177 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. 
Duration: 0.9 2023-09-30 15:58:31,202 WARNING [train.py:1197] (0/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:31,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:58:33,303 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:58:34,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:58:35,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:58:35,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 15:58:35,695 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:36,248 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.84 vs. limit=10.0 2023-09-30 15:58:36,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:38,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:38,468 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 15:58:40,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 15:58:40,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:41,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:41,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:58:43,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:58:43,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 15:58:43,699 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:45,373 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:49,603 INFO [train.py:1039] (0/4) Epoch 22, batch 2850, loss[loss=0.1849, simple_loss=0.265, pruned_loss=0.05238, over 24015.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2482, pruned_loss=0.04879, over 4685187.74 frames. ], batch size: 80, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:58:49,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:49,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. 
Duration: 0.6818125 2023-09-30 15:58:51,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:58:51,454 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=762693.3333333334, ans=0.125 2023-09-30 15:58:54,031 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:58:54,363 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=762693.3333333334, ans=0.5 2023-09-30 15:58:56,508 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.75 vs. limit=15.0 2023-09-30 15:58:57,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:58:59,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:59,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:59:02,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:03,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:59:04,482 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. 
Duration: 0.788875 2023-09-30 15:59:05,960 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 15:59:11,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 15:59:11,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:13,453 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 15:59:14,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:16,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 15:59:18,015 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 15:59:21,037 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:29,954 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=762826.6666666666, ans=0.125 2023-09-30 15:59:32,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:32,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:34,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 15:59:35,377 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=762826.6666666666, ans=0.07 2023-09-30 15:59:36,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:59:36,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:59:36,531 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:59:38,064 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:59:39,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 15:59:41,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:59:41,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:59:41,338 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:42,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:46,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:47,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 15:59:49,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:51,138 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=762893.3333333334, ans=0.125 2023-09-30 15:59:52,221 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:52,580 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=762893.3333333334, ans=0.0 2023-09-30 15:59:53,728 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:59:53,827 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:55,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:58,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:00:01,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:00:03,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 16:00:05,244 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 16:00:06,787 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 16:00:06,880 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:06,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 16:00:06,992 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:00:08,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:08,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:10,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:00:10,542 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 16:00:10,609 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 16:00:10,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:10,721 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:11,976 INFO [train.py:1039] (0/4) Epoch 22, batch 2900, loss[loss=0.1733, simple_loss=0.2485, pruned_loss=0.04907, over 23571.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2473, pruned_loss=0.04829, over 4681451.28 frames. 
], batch size: 149, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 16:00:15,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:16,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:16,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:00:18,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 16:00:23,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:23,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 16:00:25,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 16:00:26,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:00:26,752 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:00:28,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:28,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:00:32,968 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 16:00:33,052 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:36,627 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:00:38,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 16:00:38,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:00:39,621 WARNING [train.py:1197] (0/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:43,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 16:00:43,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 16:00:46,486 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:46,490 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 16:00:46,526 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:00:50,817 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.778e+02 1.944e+02 2.291e+02 4.038e+02, threshold=3.888e+02, percent-clipped=1.0 2023-09-30 16:00:50,948 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 16:00:50,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:52,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:53,068 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=763160.0, ans=0.125 2023-09-30 16:00:54,776 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:58,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:01:01,629 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:03,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 16:01:03,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 16:01:03,252 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:01:07,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:01:09,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 16:01:11,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. 
Duration: 0.7909375 2023-09-30 16:01:16,836 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:01:18,721 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=763293.3333333334, ans=0.125 2023-09-30 16:01:26,039 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:01:26,079 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:01:28,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 16:01:32,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:32,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 16:01:32,586 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:32,674 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:01:34,060 INFO [train.py:1039] (0/4) Epoch 22, batch 2950, loss[loss=0.1585, simple_loss=0.2476, pruned_loss=0.03473, over 24656.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2491, pruned_loss=0.04861, over 4687726.46 frames. ], batch size: 65, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:01:37,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 16:01:40,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 16:01:42,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:42,084 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:43,643 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:01:47,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:01:47,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 16:01:47,435 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=763360.0, ans=0.95 2023-09-30 16:01:48,794 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 16:01:48,902 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:01:48,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:55,756 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:01:58,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 16:02:00,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:02:01,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:05,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:05,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:02:09,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:02:12,304 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 16:02:17,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 16:02:17,130 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 16:02:17,260 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:02:18,867 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9121W0012-554933 from training. 
Duration: 0.896 2023-09-30 16:02:19,018 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=763493.3333333334, ans=0.2 2023-09-30 16:02:21,734 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 16:02:21,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:21,869 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:02:21,869 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 16:02:21,876 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:02:25,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 16:02:25,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:25,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:02:28,817 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:30,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:02:30,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:02:30,521 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 16:02:32,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:32,078 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 16:02:40,078 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:42,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:02:42,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 16:02:42,184 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:02:43,789 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 16:02:48,194 WARNING [train.py:1197] (0/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:02:49,743 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:51,183 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:02:51,391 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:02:51,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:02:52,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:02:55,886 INFO [train.py:1039] (0/4) Epoch 22, batch 3000, loss[loss=0.1802, simple_loss=0.2727, pruned_loss=0.04389, over 24665.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2496, pruned_loss=0.04883, over 4694022.75 frames. ], batch size: 73, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:02:55,887 INFO [train.py:1062] (0/4) Computing validation loss 2023-09-30 16:03:10,441 INFO [train.py:1071] (0/4) Epoch 22, validation: loss=0.3133, simple_loss=0.2748, pruned_loss=0.1759, over 1125622.00 frames. 2023-09-30 16:03:10,442 INFO [train.py:1072] (0/4) Maximum memory allocated so far is 20954MB 2023-09-30 16:03:10,505 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:10,514 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:03:10,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:03:10,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:03:12,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:03:12,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 16:03:12,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 16:03:14,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:18,136 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:03:18,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:03:21,332 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 16:03:22,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 16:03:23,131 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=763693.3333333334, ans=0.125 2023-09-30 16:03:24,457 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:03:25,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:03:25,859 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 16:03:27,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:03:29,658 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=763760.0, ans=0.125 2023-09-30 16:03:34,535 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:03:43,865 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:03:50,839 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.854e+02 2.053e+02 2.240e+02 3.574e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 16:03:52,487 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 16:03:52,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:03:54,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=763826.6666666666, ans=0.125 2023-09-30 16:03:55,724 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:03:57,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:57,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:03:59,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:03:59,999 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 16:04:00,241 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 16:04:02,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:04:03,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:04:05,396 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:04:07,239 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:07,346 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:07,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:04:10,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:04:11,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:04:11,884 WARNING [train.py:1197] (0/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:04:15,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. 
Duration: 0.9333125 2023-09-30 16:04:16,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 16:04:18,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:04:18,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:18,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:04:23,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:23,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=763960.0, ans=0.1 2023-09-30 16:04:25,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:25,403 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 16:04:25,470 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 16:04:27,574 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:04:27,624 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 16:04:29,069 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. 
Duration: 0.6909375 2023-09-30 16:04:30,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 16:04:32,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:04:32,519 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:04:32,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 16:04:32,676 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 16:04:32,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:04:32,938 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=764026.6666666666, ans=0.05 2023-09-30 16:04:34,059 INFO [train.py:1039] (0/4) Epoch 22, batch 3050, loss[loss=0.1824, simple_loss=0.2478, pruned_loss=0.05847, over 23405.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2499, pruned_loss=0.04841, over 4703690.81 frames. ], batch size: 285, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:04:34,299 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:04:36,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:36,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. 
Duration: 0.52725 2023-09-30 16:04:36,529 WARNING [train.py:1197] (0/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:38,033 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:04:40,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 16:04:44,708 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:04:46,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:04:46,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:04:49,584 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:49,983 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=764093.3333333334, ans=0.125 2023-09-30 16:04:53,152 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 16:04:59,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 16:04:59,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 16:05:01,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:05:04,548 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:05:04,874 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:05:07,628 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:07,640 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:07,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:11,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:12,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:05:12,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:12,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:12,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:16,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:18,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:05:21,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:21,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 16:05:22,635 WARNING [train.py:1197] (0/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:22,657 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:05:25,839 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:05:25,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:05:27,331 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:05:27,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:33,988 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:35,435 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:40,828 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:40,901 WARNING [train.py:1197] (0/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:05:40,907 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:41,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=764293.3333333334, ans=0.125 2023-09-30 16:05:42,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:44,653 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:05:44,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:46,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 16:05:49,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:49,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:49,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 16:05:51,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:56,005 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. 
Duration: 0.8818125 2023-09-30 16:05:57,366 INFO [train.py:1039] (0/4) Epoch 22, batch 3100, loss[loss=0.1794, simple_loss=0.2469, pruned_loss=0.05595, over 23796.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2496, pruned_loss=0.04869, over 4709199.75 frames. ], batch size: 164, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:05:58,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:06:00,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:06:03,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 16:06:04,403 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.32 vs. limit=6.0 2023-09-30 16:06:07,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 16:06:08,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 16:06:09,542 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=764360.0, ans=0.125 2023-09-30 16:06:10,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:06:13,918 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:06:13,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:06:17,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:06:20,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:25,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 16:06:26,582 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.04 vs. limit=15.0 2023-09-30 16:06:31,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:06:33,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:33,313 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:06:34,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:06:34,774 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:06:34,986 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:06:35,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 16:06:35,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 16:06:36,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:38,093 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.888e+02 2.068e+02 2.233e+02 3.046e+02, threshold=4.136e+02, percent-clipped=0.0 2023-09-30 16:06:38,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 16:06:38,622 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=764493.3333333334, ans=0.1 2023-09-30 16:06:40,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:06:42,992 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=764493.3333333334, ans=0.125 2023-09-30 16:06:45,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:06:45,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 16:06:47,038 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 16:06:47,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:47,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:06:50,896 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:06:50,934 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:52,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:06:52,566 WARNING [train.py:1197] (0/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:06:52,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:06:55,399 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:06:55,456 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:06:55,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:55,477 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:07:00,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:07:02,138 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 16:07:03,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. 
Duration: 0.888875 2023-09-30 16:07:05,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 16:07:05,756 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.60 vs. limit=22.5 2023-09-30 16:07:06,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:06,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:06,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 16:07:17,773 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=764626.6666666666, ans=0.0 2023-09-30 16:07:19,009 WARNING [train.py:1197] (0/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 16:07:20,440 INFO [train.py:1039] (0/4) Epoch 22, batch 3150, loss[loss=0.152, simple_loss=0.2357, pruned_loss=0.03417, over 24672.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2485, pruned_loss=0.0488, over 4700235.39 frames. ], batch size: 65, lr: 4.66e-03, grad_scale: 8.0 2023-09-30 16:07:20,934 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=764693.3333333334, ans=0.125 2023-09-30 16:07:22,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:22,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:07:23,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:07:23,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:07:24,051 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 16:07:24,722 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.78 vs. limit=15.0 2023-09-30 16:07:26,040 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:26,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:07:27,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 16:07:29,968 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.07 vs. limit=22.5 2023-09-30 16:07:30,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:32,436 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 16:07:35,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 16:07:35,522 WARNING [train.py:1197] (0/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 16:07:37,108 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 16:07:37,459 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=764760.0, ans=0.2 2023-09-30 16:07:38,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 16:07:40,159 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 16:07:40,367 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=764760.0, ans=0.04949747468305833 2023-09-30 16:07:41,631 WARNING [train.py:1197] (0/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 16:07:41,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 16:07:41,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:41,679 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:07:43,300 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:43,667 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=764760.0, ans=0.0 2023-09-30 16:07:44,847 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. 
Duration: 0.87 2023-09-30 16:07:45,026 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=764760.0, ans=0.0 2023-09-30 16:07:46,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:46,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:48,461 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:07:50,589 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:07:55,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 16:07:55,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:07:55,356 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=764826.6666666666, ans=0.0 2023-09-30 16:07:58,160 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:08:00,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:08:00,199 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 16:08:03,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. 
Duration: 0.59 2023-09-30 16:08:05,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:08:05,326 WARNING [train.py:1197] (0/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:08:05,347 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:08:06,781 WARNING [train.py:1197] (0/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:06,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:08:08,863 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-09-30 16:08:09,649 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:08:09,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:08:09,818 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 16:08:10,201 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=764893.3333333334, ans=0.125 2023-09-30 16:08:11,230 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. 
Duration: 0.6090625 2023-09-30 16:08:11,245 WARNING [train.py:1197] (0/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:11,550 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=764893.3333333334, ans=0.125 2023-09-30 16:08:12,804 WARNING [train.py:1197] (0/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:08:12,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:08:12,949 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 16:08:14,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:17,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 16:08:17,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:17,618 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 16:08:19,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 16:08:20,678 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:08:20,731 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:08:22,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 16:08:24,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 16:08:24,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:26,506 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:08:27,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:29,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:08:34,518 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:08:36,036 WARNING [train.py:1197] (0/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:38,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 16:08:41,417 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=765026.6666666666, ans=0.1 2023-09-30 16:08:42,511 INFO [train.py:1039] (0/4) Epoch 22, batch 3200, loss[loss=0.1718, simple_loss=0.2565, pruned_loss=0.04357, over 23680.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.248, pruned_loss=0.04842, over 4694094.09 frames. 
], batch size: 85, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:08:44,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:08:44,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 16:08:48,685 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:48,852 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:08:50,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 16:08:53,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:57,215 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:08:59,088 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=765093.3333333334, ans=0.125 2023-09-30 16:09:00,410 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:09:11,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 16:09:15,069 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=765160.0, ans=0.1 2023-09-30 16:09:17,072 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.06 vs. limit=12.0 2023-09-30 16:09:18,064 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:09:20,659 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 16:09:22,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:09:23,567 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.875e+02 2.082e+02 2.411e+02 3.393e+02, threshold=4.163e+02, percent-clipped=0.0 2023-09-30 16:09:25,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 16:09:25,447 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:09:30,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:09:30,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:09:31,304 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.60 vs. limit=15.0 2023-09-30 16:09:32,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 16:09:37,334 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 16:09:38,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:09:40,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 16:09:44,327 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 16:09:47,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:09:53,886 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:53,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:09:55,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:55,415 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 16:09:55,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:10:00,213 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:10:00,649 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=765293.3333333334, ans=0.1 2023-09-30 16:10:01,843 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 16:10:01,922 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 16:10:03,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 16:10:04,739 INFO [train.py:1039] (0/4) Epoch 22, batch 3250, loss[loss=0.1719, simple_loss=0.2481, pruned_loss=0.04781, over 24595.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2478, pruned_loss=0.04842, over 4697895.41 frames. ], batch size: 60, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:10:04,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 16:10:07,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:10:10,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:10:10,223 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 16:10:10,285 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:10,289 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 16:10:12,421 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 16:10:14,194 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=765360.0, ans=0.1 2023-09-30 16:10:17,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:10:19,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:27,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:10:27,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 16:10:28,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:29,805 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.79 vs. limit=22.5 2023-09-30 16:10:30,278 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:10:30,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:31,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:31,964 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. 
Duration: 0.8555625 2023-09-30 16:10:35,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,176 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:10:35,242 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:35,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,305 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:10:37,116 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=765493.3333333334, ans=0.1 2023-09-30 16:10:39,977 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:43,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:45,065 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:45,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 16:10:46,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:46,700 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:47,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:10:52,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 16:10:52,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:52,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:10:53,133 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.51 vs. limit=15.0 2023-09-30 16:10:54,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:56,208 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:11:02,174 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:11:03,967 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=765560.0, ans=0.2 2023-09-30 16:11:09,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 16:11:09,944 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:09,945 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 16:11:09,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:11:09,971 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:11:10,166 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=765626.6666666666, ans=0.0 2023-09-30 16:11:11,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:15,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 16:11:15,087 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 16:11:15,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:11:16,761 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:16,849 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:18,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. 
Duration: 0.52225 2023-09-30 16:11:18,398 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:22,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:22,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:25,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 16:11:25,086 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:28,030 INFO [train.py:1039] (0/4) Epoch 22, batch 3300, loss[loss=0.1552, simple_loss=0.2274, pruned_loss=0.04144, over 24305.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2486, pruned_loss=0.04852, over 4711082.07 frames. ], batch size: 56, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:11:28,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:11:28,098 WARNING [train.py:1197] (0/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 16:11:30,472 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:32,461 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 16:11:34,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. 
Duration: 0.84 2023-09-30 16:11:34,412 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=765693.3333333334, ans=0.2 2023-09-30 16:11:35,668 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 16:11:36,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:40,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:40,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:11:41,685 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:43,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:11:43,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:11:46,934 WARNING [train.py:1197] (0/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:48,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:53,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 16:11:53,118 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 16:11:53,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:56,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:57,481 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 16:11:57,697 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=765760.0, ans=0.0 2023-09-30 16:11:58,996 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:01,013 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:12:01,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:12:01,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:01,177 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 16:12:06,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:06,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:12:07,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:12:07,931 WARNING [train.py:1197] (0/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 16:12:09,732 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.926e+02 2.080e+02 2.384e+02 3.230e+02, threshold=4.160e+02, percent-clipped=0.0 2023-09-30 16:12:09,953 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 16:12:09,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:11,457 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:12:13,141 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 16:12:14,701 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 16:12:14,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:12:16,411 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 16:12:19,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:23,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:12:23,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. 
Duration: 0.7 2023-09-30 16:12:24,923 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=765893.3333333334, ans=0.09899494936611666 2023-09-30 16:12:27,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:27,798 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:27,801 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:29,200 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:12:32,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:12:32,181 WARNING [train.py:1197] (0/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:33,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:12:33,965 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=765960.0, ans=0.1 2023-09-30 16:12:37,206 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 16:12:37,339 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 16:12:38,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. 
Duration: 0.463625 2023-09-30 16:12:38,979 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:12:38,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:41,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:41,131 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:42,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:12:44,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:44,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:12:46,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:46,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:12:46,629 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=765960.0, ans=0.125 2023-09-30 16:12:49,380 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. 
Duration: 0.77 2023-09-30 16:12:49,446 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:49,704 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=766026.6666666666, ans=0.07 2023-09-30 16:12:50,910 INFO [train.py:1039] (0/4) Epoch 22, batch 3350, loss[loss=0.1889, simple_loss=0.2544, pruned_loss=0.0617, over 22677.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2496, pruned_loss=0.04877, over 4710338.33 frames. ], batch size: 322, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:12:51,065 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:53,956 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:12:54,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:55,493 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:59,020 WARNING [train.py:1197] (0/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:59,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:00,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:13:02,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 16:13:03,763 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:13:04,075 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=766026.6666666666, ans=0.2 2023-09-30 16:13:05,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:08,442 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.53 vs. limit=10.0 2023-09-30 16:13:08,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:13:10,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:10,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:13:12,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 16:13:13,984 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 16:13:14,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:17,239 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=766093.3333333334, ans=0.035 2023-09-30 16:13:18,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. 
Duration: 0.63 2023-09-30 16:13:18,539 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 16:13:18,697 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:13:20,651 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:13:20,895 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=766093.3333333334, ans=0.125 2023-09-30 16:13:22,123 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:22,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 16:13:22,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:22,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:13:23,922 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=766160.0, ans=0.125 2023-09-30 16:13:25,137 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:25,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:25,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:13:26,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:13:28,641 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=766160.0, ans=0.125 2023-09-30 16:13:29,010 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.07 vs. limit=15.0 2023-09-30 16:13:31,344 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:35,022 WARNING [train.py:1197] (0/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:35,111 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:39,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:13:41,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:42,816 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:42,840 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:45,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:13:46,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 16:13:46,687 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:13:46,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 16:13:48,170 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:13:48,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 16:13:49,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:51,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:58,473 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:59,908 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 16:13:59,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:00,105 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 16:14:00,390 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=766293.3333333334, ans=0.2 2023-09-30 16:14:01,634 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:14:04,928 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:08,526 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 16:14:08,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:14:08,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:14:10,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:11,654 WARNING [train.py:1197] (0/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 16:14:13,632 INFO [train.py:1039] (0/4) Epoch 22, batch 3400, loss[loss=0.1776, simple_loss=0.2553, pruned_loss=0.04994, over 23580.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.251, pruned_loss=0.04963, over 4692961.19 frames. ], batch size: 120, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:14:13,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:14:13,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. 
Duration: 0.8 2023-09-30 16:14:15,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,186 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,282 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:14:18,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:14:18,798 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 16:14:24,802 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 16:14:24,820 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 16:14:24,847 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:14:28,612 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:28,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:30,173 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:31,687 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. 
Duration: 0.5 2023-09-30 16:14:36,320 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=766426.6666666666, ans=0.1 2023-09-30 16:14:36,943 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.83 vs. limit=15.0 2023-09-30 16:14:37,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:14:39,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 16:14:42,461 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=766426.6666666666, ans=0.125 2023-09-30 16:14:44,348 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:14:47,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:47,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:47,709 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=766493.3333333334, ans=0.125 2023-09-30 16:14:49,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. 
Duration: 0.57275 2023-09-30 16:14:53,685 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=766493.3333333334, ans=0.125 2023-09-30 16:14:56,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:14:57,971 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.853e+02 2.034e+02 2.228e+02 2.939e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 16:14:58,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 16:15:04,995 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 16:15:07,916 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:07,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:09,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:15:09,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 16:15:09,803 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=766560.0, ans=0.125 2023-09-30 16:15:11,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:15:16,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:15:16,165 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:15:21,596 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:23,349 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 16:15:24,946 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=766626.6666666666, ans=0.125 2023-09-30 16:15:25,659 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.83 vs. limit=10.0 2023-09-30 16:15:28,673 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:15:33,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 16:15:36,871 INFO [train.py:1039] (0/4) Epoch 22, batch 3450, loss[loss=0.1584, simple_loss=0.2182, pruned_loss=0.04929, over 22644.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2497, pruned_loss=0.04941, over 4699969.51 frames. 
], batch size: 322, lr: 4.65e-03, grad_scale: 4.0 2023-09-30 16:15:36,997 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 16:15:37,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:40,019 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:15:40,047 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 16:15:40,169 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:43,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:15:51,591 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:15:53,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:15:54,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:15:54,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:56,900 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 16:16:00,935 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.22 vs. limit=15.0 2023-09-30 16:16:03,713 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 16:16:10,281 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 16:16:10,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:16:10,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:16:13,342 WARNING [train.py:1197] (0/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:18,092 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 16:16:19,593 WARNING [train.py:1197] (0/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:16:22,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:22,957 WARNING [train.py:1197] (0/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:16:24,431 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:16:26,765 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. 
Duration: 0.7818125 2023-09-30 16:16:28,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 16:16:28,450 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:16:30,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:32,487 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=766893.3333333334, ans=0.0 2023-09-30 16:16:33,652 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:16:36,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 16:16:41,679 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:16:48,515 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:16:50,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:50,456 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=766960.0, ans=0.125 2023-09-30 16:16:53,205 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:16:56,419 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:16:57,866 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:57,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:16:59,263 INFO [train.py:1039] (0/4) Epoch 22, batch 3500, loss[loss=0.1801, simple_loss=0.2462, pruned_loss=0.05696, over 23917.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.249, pruned_loss=0.04892, over 4715032.43 frames. ], batch size: 195, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:16:59,330 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:17:04,684 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:07,711 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:17:09,633 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 16:17:11,107 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:17:13,474 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:17:16,665 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:16,688 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. 
Duration: 0.82 2023-09-30 16:17:21,792 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:17:21,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:17:23,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:17:23,598 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:24,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:17:25,059 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:26,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:26,558 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 16:17:28,697 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.87 vs. limit=15.0 2023-09-30 16:17:29,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:30,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. 
Duration: 0.87775 2023-09-30 16:17:32,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:36,266 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:37,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 16:17:37,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:39,389 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:39,694 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=767160.0, ans=0.1 2023-09-30 16:17:42,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:17:44,404 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.347e+02 1.810e+02 1.991e+02 2.339e+02 3.631e+02, threshold=3.981e+02, percent-clipped=0.0 2023-09-30 16:17:44,537 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:46,100 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:17:46,128 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 16:17:47,698 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 16:17:47,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 16:17:49,961 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 16:17:50,257 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=767226.6666666666, ans=0.2 2023-09-30 16:17:52,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:53,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:53,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:53,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:17:55,539 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=767226.6666666666, ans=0.0 2023-09-30 16:17:58,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:17:58,410 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:18:03,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 16:18:04,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 16:18:04,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 16:18:04,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:07,939 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:10,024 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:11,615 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:13,220 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 16:18:13,332 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:14,813 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:18:16,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 16:18:17,176 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=767293.3333333334, ans=0.0 2023-09-30 16:18:18,438 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. 
Duration: 0.45 2023-09-30 16:18:20,050 WARNING [train.py:1197] (0/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:21,943 WARNING [train.py:1197] (0/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:21,970 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:22,023 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:24,015 INFO [train.py:1039] (0/4) Epoch 22, batch 3550, loss[loss=0.1773, simple_loss=0.2697, pruned_loss=0.04243, over 24441.00 frames. ], tot_loss[loss=0.1718, simple_loss=0.2472, pruned_loss=0.04817, over 4706121.28 frames. ], batch size: 69, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:18:25,777 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:18:34,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:34,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 16:18:39,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:39,555 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:18:43,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:18:43,353 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:18:44,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:18:47,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:49,323 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:18:49,415 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:50,796 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:18:50,931 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:18:58,374 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:18:59,820 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:19:00,021 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:00,028 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:01,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. 
Duration: 0.736375 2023-09-30 16:19:01,513 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 16:19:01,530 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:03,071 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:04,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:19:09,934 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.13 vs. limit=15.0 2023-09-30 16:19:10,704 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:10,830 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:19:12,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:12,790 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.02 vs. limit=15.0 2023-09-30 16:19:15,879 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 16:19:15,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. 
Duration: 0.8 2023-09-30 16:19:17,416 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 16:19:17,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:19,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:19:19,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:19:23,729 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 16:19:24,419 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.99 vs. limit=6.0 2023-09-30 16:19:25,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:30,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:32,632 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 16:19:34,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:19:34,669 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=767626.6666666666, ans=0.1 2023-09-30 16:19:37,369 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:38,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 16:19:43,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=767693.3333333334, ans=0.09899494936611666 2023-09-30 16:19:44,985 INFO [train.py:1039] (0/4) Epoch 22, batch 3600, loss[loss=0.1789, simple_loss=0.2499, pruned_loss=0.05392, over 23708.00 frames. ], tot_loss[loss=0.1712, simple_loss=0.2471, pruned_loss=0.04769, over 4717847.22 frames. ], batch size: 256, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:19:46,505 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 16:19:46,571 WARNING [train.py:1197] (0/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:19:46,669 WARNING [train.py:1197] (0/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:19:48,292 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:48,645 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=767693.3333333334, ans=0.125 2023-09-30 16:19:50,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:19:51,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:19:55,124 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:56,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:58,115 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:19:58,212 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:19:58,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:59,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 16:20:02,077 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:20:04,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:06,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:09,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:11,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. 
Duration: 0.7090625 2023-09-30 16:20:11,181 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:20:11,219 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 16:20:12,721 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:15,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:17,227 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:20:18,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:22,006 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:24,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:25,636 WARNING [train.py:1197] (0/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 16:20:30,137 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.855e+02 2.093e+02 2.539e+02 3.867e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 16:20:31,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:20:33,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. 
Duration: 0.7555625 2023-09-30 16:20:33,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 16:20:39,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:20:45,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:47,544 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:51,198 WARNING [train.py:1197] (0/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:20:51,222 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:20:51,231 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 16:20:52,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 16:20:54,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 16:20:57,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:57,875 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:20:58,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. 
Duration: 0.84 2023-09-30 16:20:59,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:20:59,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:20:59,555 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:01,401 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 16:21:01,705 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:21:04,135 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 16:21:06,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:21:06,742 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 16:21:08,381 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=768026.6666666666, ans=0.1 2023-09-30 16:21:09,573 INFO [train.py:1039] (0/4) Epoch 22, batch 3650, loss[loss=0.1712, simple_loss=0.265, pruned_loss=0.0387, over 24454.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.249, pruned_loss=0.04872, over 4701590.51 frames. ], batch size: 69, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:21:13,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. 
Duration: 0.79 2023-09-30 16:21:13,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:21:20,004 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 16:21:21,594 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 16:21:26,306 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:21:26,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:21:27,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:21:31,156 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:21:31,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:32,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 16:21:34,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:21:34,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:21:34,314 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. 
Duration: 0.93 2023-09-30 16:21:36,465 WARNING [train.py:1197] (0/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:21:36,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:21:36,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:39,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:21:43,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 16:21:44,704 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 16:21:46,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:21:47,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 16:21:49,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:21:49,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:21:56,775 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:21:58,270 WARNING [train.py:1197] (0/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 16:21:58,288 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:21:59,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:22:01,303 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:22:04,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:22:04,816 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=768226.6666666666, ans=0.125 2023-09-30 16:22:06,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:06,298 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=768226.6666666666, ans=0.125 2023-09-30 16:22:07,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:07,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:22:10,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:22:11,093 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.96 vs. 
limit=10.0 2023-09-30 16:22:12,401 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:22:12,500 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:14,703 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.34 vs. limit=6.0 2023-09-30 16:22:15,766 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=768293.3333333334, ans=0.1 2023-09-30 16:22:18,715 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 16:22:22,366 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:22,396 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:25,744 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:22:25,819 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:27,975 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:22:29,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:31,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. 
Duration: 0.99 2023-09-30 16:22:31,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:32,623 INFO [train.py:1039] (0/4) Epoch 22, batch 3700, loss[loss=0.1954, simple_loss=0.2705, pruned_loss=0.06016, over 23921.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2496, pruned_loss=0.04881, over 4709585.40 frames. ], batch size: 86, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:22:32,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:22:35,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:36,137 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=768360.0, ans=0.0 2023-09-30 16:22:37,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:22:40,405 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:40,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 16:22:40,421 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:40,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:22:42,001 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. 
Duration: 0.7333125 2023-09-30 16:22:46,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:22:48,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:48,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:50,569 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:22:51,927 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:52,036 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:22:55,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:55,525 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=768426.6666666666, ans=0.125 2023-09-30 16:22:57,103 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 16:23:02,710 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=768426.6666666666, ans=0.125 2023-09-30 16:23:05,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:23:07,214 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. 
Duration: 0.6666875 2023-09-30 16:23:07,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:23:07,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 16:23:07,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:12,029 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:12,168 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 16:23:13,691 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:15,165 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:23:16,433 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.838e+02 2.020e+02 2.456e+02 4.416e+02, threshold=4.040e+02, percent-clipped=2.0 2023-09-30 16:23:18,139 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:18,184 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:23:21,716 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. 
Duration: 0.4454375 2023-09-30 16:23:24,204 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.36 vs. limit=15.0 2023-09-30 16:23:26,387 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:26,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 16:23:27,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:23:27,894 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 16:23:32,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:23:32,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:23:36,616 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:36,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 16:23:40,544 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:23:40,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:23:40,597 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 16:23:42,010 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:45,164 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:46,586 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 16:23:46,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 16:23:48,278 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:23:48,300 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:23:48,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:23:49,955 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:23:51,741 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=768626.6666666666, ans=0.125 2023-09-30 16:23:53,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:54,771 INFO [train.py:1039] (0/4) Epoch 22, batch 3750, loss[loss=0.1761, simple_loss=0.2517, pruned_loss=0.05025, over 23173.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2507, pruned_loss=0.04879, over 4712302.25 frames. 
], batch size: 93, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:23:54,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:23:55,348 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=768693.3333333334, ans=0.125 2023-09-30 16:23:56,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:23:59,317 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 16:24:00,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:24:02,553 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:24:04,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 16:24:04,134 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:24:04,403 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=768693.3333333334, ans=0.125 2023-09-30 16:24:05,626 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:05,792 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 16:24:09,146 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:14,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:16,315 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=768760.0, ans=0.125 2023-09-30 16:24:17,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:24:18,155 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.44 vs. limit=22.5 2023-09-30 16:24:18,233 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=768760.0, ans=10.0 2023-09-30 16:24:19,055 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:24:20,741 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:24:24,966 WARNING [train.py:1197] (0/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:25,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 16:24:26,599 WARNING [train.py:1197] (0/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. 
Duration: 0.911125 2023-09-30 16:24:28,121 WARNING [train.py:1197] (0/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:28,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:31,910 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 16:24:35,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 16:24:35,250 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=768826.6666666666, ans=0.5 2023-09-30 16:24:36,476 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:37,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:38,936 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.62 vs. limit=15.0 2023-09-30 16:24:40,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:46,655 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:48,180 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. 
Duration: 0.42725 2023-09-30 16:24:51,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 16:24:53,166 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:56,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:56,371 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:24:58,134 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=768960.0, ans=0.0 2023-09-30 16:25:00,929 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:25:06,142 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:25:07,718 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:25:09,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:25:10,915 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:25:14,053 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:25:16,048 INFO [train.py:1039] (0/4) Epoch 22, batch 3800, loss[loss=0.1819, simple_loss=0.2591, pruned_loss=0.05236, over 23367.00 frames. 
], tot_loss[loss=0.175, simple_loss=0.2514, pruned_loss=0.04928, over 4707591.18 frames. ], batch size: 119, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:25:19,346 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=769026.6666666666, ans=0.1 2023-09-30 16:25:23,447 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:25:26,675 WARNING [train.py:1197] (0/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:28,219 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:25:29,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 16:25:30,568 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.27 vs. limit=22.5 2023-09-30 16:25:31,175 WARNING [train.py:1197] (0/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:31,398 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:32,910 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:25:36,501 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 16:25:36,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:25:36,636 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:25:38,193 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:38,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:25:39,619 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:41,133 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 16:25:44,173 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 16:25:44,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:25:47,362 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:50,968 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:25:51,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:25:53,336 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:25:53,358 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:25:56,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:56,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:26:00,869 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.761e+02 1.975e+02 2.225e+02 3.481e+02, threshold=3.951e+02, percent-clipped=0.0 2023-09-30 16:26:02,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:26:02,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 16:26:02,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:06,161 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=769226.6666666666, ans=0.0 2023-09-30 16:26:11,063 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:15,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:26:17,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 16:26:19,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 16:26:19,305 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 16:26:22,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:24,312 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:25,188 WARNING [train.py:1197] (0/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 16:26:29,705 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 16:26:29,723 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 16:26:29,771 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:32,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:33,329 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=769293.3333333334, ans=0.125 2023-09-30 16:26:39,666 INFO [train.py:1039] (0/4) Epoch 22, batch 3850, loss[loss=0.168, simple_loss=0.2585, pruned_loss=0.03875, over 24647.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2504, pruned_loss=0.04896, over 4705348.30 frames. ], batch size: 68, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:26:39,755 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:26:39,862 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. 
Duration: 0.8333125 2023-09-30 16:26:44,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:26:44,644 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 16:26:46,195 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:26:46,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:49,553 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:26:53,206 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:54,810 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:26:56,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 16:27:03,961 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:07,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:27:09,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:27:09,543 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=769426.6666666666, ans=0.0 2023-09-30 16:27:10,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:27:12,365 WARNING [train.py:1197] (0/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:12,472 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:27:12,560 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:12,580 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:27:14,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,808 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:17,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:27:17,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 16:27:17,563 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. 
Duration: 0.91 2023-09-30 16:27:19,035 WARNING [train.py:1197] (0/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:19,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:22,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:22,284 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:23,676 WARNING [train.py:1197] (0/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 16:27:25,778 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 16:27:26,083 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=769493.3333333334, ans=0.0 2023-09-30 16:27:28,862 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:30,983 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 16:27:32,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:27:39,360 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:27:39,539 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:27:40,861 WARNING [train.py:1197] (0/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:44,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:46,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 16:27:47,793 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 16:27:50,693 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:51,031 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=769626.6666666666, ans=0.1 2023-09-30 16:27:52,141 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:53,878 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:27:53,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:27:55,312 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,442 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 16:27:55,444 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:27:55,452 WARNING [train.py:1197] (0/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 16:27:56,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:58,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 16:27:58,504 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:58,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:02,238 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:28:03,526 INFO [train.py:1039] (0/4) Epoch 22, batch 3900, loss[loss=0.1823, simple_loss=0.2521, pruned_loss=0.0563, over 23878.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2489, pruned_loss=0.04843, over 4696448.84 frames. ], batch size: 195, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:28:03,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:05,090 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:28:05,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. 
Duration: 0.92725 2023-09-30 16:28:05,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:28:05,293 WARNING [train.py:1197] (0/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:07,226 WARNING [train.py:1197] (0/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 16:28:07,322 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:11,812 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:13,874 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:13,941 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:28:14,294 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=769693.3333333334, ans=0.0 2023-09-30 16:28:15,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:17,150 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:19,076 WARNING [train.py:1197] (0/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:20,785 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 16:28:21,149 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=769760.0, ans=0.0 2023-09-30 16:28:22,320 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 16:28:22,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:23,959 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 16:28:24,008 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:25,425 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 16:28:27,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 16:28:30,243 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:31,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:31,731 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:28:33,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:28:39,892 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 16:28:41,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:28:46,383 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:28:46,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:28:48,303 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.859e+02 2.113e+02 2.353e+02 3.355e+02, threshold=4.226e+02, percent-clipped=0.0 2023-09-30 16:28:48,404 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:28:55,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:55,249 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:29:02,838 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:29:03,010 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:29:13,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:15,747 WARNING [train.py:1197] (0/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. 
Duration: 0.97775 2023-09-30 16:29:17,706 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 16:29:17,774 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 16:29:19,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:19,466 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 16:29:21,025 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:29:22,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 16:29:24,714 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=770026.6666666666, ans=0.015 2023-09-30 16:29:25,802 INFO [train.py:1039] (0/4) Epoch 22, batch 3950, loss[loss=0.1835, simple_loss=0.2496, pruned_loss=0.0587, over 23809.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2488, pruned_loss=0.04847, over 4717159.98 frames. ], batch size: 164, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:29:29,664 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:29:31,138 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 16:29:31,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. 
Duration: 0.8909375 2023-09-30 16:29:33,494 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.32 vs. limit=22.5 2023-09-30 16:29:34,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:29:37,178 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:29:39,204 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=770026.6666666666, ans=0.125 2023-09-30 16:29:43,212 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 16:29:43,307 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:43,351 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 16:29:44,796 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 16:29:44,838 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:48,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:48,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:29:48,106 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:29:52,378 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 16:29:54,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:29:55,423 WARNING [train.py:1197] (0/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:55,455 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:29:55,550 WARNING [train.py:1197] (0/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:29:56,963 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:30:07,488 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:30:08,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:30:10,845 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=770160.0, ans=0.2 2023-09-30 16:30:13,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 16:30:19,516 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 16:30:19,521 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. 
Duration: 0.85 2023-09-30 16:30:19,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:30:21,027 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:30:26,995 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=770226.6666666666, ans=0.0 2023-09-30 16:30:29,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:30:29,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:30:31,499 WARNING [train.py:1197] (0/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:30:31,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:30:31,605 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 16:30:37,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:30:39,373 WARNING [train.py:1197] (0/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:30:43,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 16:30:48,435 INFO [train.py:1039] (0/4) Epoch 22, batch 4000, loss[loss=0.1719, simple_loss=0.2584, pruned_loss=0.04271, over 24331.00 frames. 
], tot_loss[loss=0.1727, simple_loss=0.2488, pruned_loss=0.04835, over 4713748.78 frames. ], batch size: 74, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:30:53,424 WARNING [train.py:1197] (0/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:30:58,830 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=770360.0, ans=0.1 2023-09-30 16:31:02,217 WARNING [train.py:1197] (0/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:06,911 WARNING [train.py:1197] (0/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:07,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:31:08,444 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:08,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 16:31:08,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:31:10,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 16:31:10,711 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:31:10,732 WARNING [train.py:1197] (0/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. 
Duration: 0.89 2023-09-30 16:31:14,257 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:17,442 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:31:17,464 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:31:17,469 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:31:17,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:17,518 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:31:17,823 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=770426.6666666666, ans=0.0 2023-09-30 16:31:19,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:31:20,692 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 16:31:20,821 WARNING [train.py:1197] (0/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:31:22,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:23,937 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9029W0025-509426 from training. 
Duration: 0.896 2023-09-30 16:31:25,329 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:31:25,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:25,719 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=770493.3333333334, ans=0.125 2023-09-30 16:31:32,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 16:31:33,541 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:34,782 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.863e+02 2.026e+02 2.291e+02 3.253e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-30 16:31:37,273 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:31:38,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:31:38,589 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=770560.0, ans=0.0 2023-09-30 16:31:39,735 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 16:31:41,329 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:31:42,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. 
Duration: 0.76 2023-09-30 16:31:42,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:31:44,359 WARNING [train.py:1197] (0/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:44,480 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:31:46,825 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:31:46,876 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:31:48,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:49,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 16:31:49,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:50,756 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 16:31:55,524 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:32:00,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:32:01,650 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. 
Duration: 0.7454375 2023-09-30 16:32:01,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:03,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:32:03,690 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:04,293 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.52 vs. limit=22.5 2023-09-30 16:32:08,809 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:08,944 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=770626.6666666666, ans=0.035 2023-09-30 16:32:11,921 INFO [train.py:1039] (0/4) Epoch 22, batch 4050, loss[loss=0.1785, simple_loss=0.2683, pruned_loss=0.04438, over 24684.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2497, pruned_loss=0.04885, over 4707752.67 frames. ], batch size: 73, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:32:13,420 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:32:13,768 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=770693.3333333334, ans=0.0 2023-09-30 16:32:14,909 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 16:32:15,153 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. 
Duration: 0.6454375 2023-09-30 16:32:16,508 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:32:16,649 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:32:18,157 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:18,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:24,269 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:28,606 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:32:29,165 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.98 vs. limit=15.0 2023-09-30 16:32:30,091 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:32:31,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:32:33,101 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:32:36,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. 
Duration: 0.97275 2023-09-30 16:32:36,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:39,779 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 16:32:41,872 WARNING [train.py:1197] (0/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 16:32:43,844 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 16:32:46,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:32:54,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 16:32:55,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:32:58,402 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:33:01,218 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:04,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:33:05,067 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:33:06,349 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. 
Duration: 0.8545625 2023-09-30 16:33:09,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:33:11,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 16:33:11,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:33:12,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:14,196 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 16:33:19,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:26,126 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 16:33:27,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:33:27,617 WARNING [train.py:1197] (0/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:33:29,380 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=770960.0, ans=0.125 2023-09-30 16:33:30,680 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 16:33:30,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. 
Duration: 0.97 2023-09-30 16:33:30,695 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:34,169 INFO [train.py:1039] (0/4) Epoch 22, batch 4100, loss[loss=0.1619, simple_loss=0.2466, pruned_loss=0.03859, over 24666.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2501, pruned_loss=0.04898, over 4710644.52 frames. ], batch size: 65, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:33:34,321 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:33:36,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:36,394 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:33:43,981 WARNING [train.py:1197] (0/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 16:33:44,302 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=771026.6666666666, ans=0.2 2023-09-30 16:33:44,600 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=22.5 2023-09-30 16:33:45,501 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 16:33:47,122 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 16:33:48,766 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. 
Duration: 0.69 2023-09-30 16:33:48,787 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:48,872 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,198 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,242 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:33:50,541 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=771093.3333333334, ans=0.0 2023-09-30 16:33:51,784 WARNING [train.py:1197] (0/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 16:33:55,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:33:55,737 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:33:55,761 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:55,939 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=771093.3333333334, ans=0.125 2023-09-30 16:33:57,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. 
Duration: 0.8444375 2023-09-30 16:33:57,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=771093.3333333334, ans=0.0 2023-09-30 16:33:58,260 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.09 vs. limit=15.0 2023-09-30 16:34:00,456 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:34:02,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:34:02,089 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:34:03,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 16:34:03,530 WARNING [train.py:1197] (0/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:03,537 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:34:03,557 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:03,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:34:04,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. 
Duration: 0.82 2023-09-30 16:34:05,394 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=771160.0, ans=0.0 2023-09-30 16:34:07,423 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=771160.0, ans=0.0 2023-09-30 16:34:08,671 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:08,834 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 16:34:10,319 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:34:11,933 WARNING [train.py:1197] (0/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:11,935 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 16:34:12,738 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.54 vs. limit=22.5 2023-09-30 16:34:13,414 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:34:14,877 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:34:16,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. 
Duration: 0.77275 2023-09-30 16:34:16,663 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=771160.0, ans=0.125 2023-09-30 16:34:17,770 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 16:34:19,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:34:20,907 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:34:21,175 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771226.6666666666, ans=0.1 2023-09-30 16:34:22,265 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.433e+02 1.834e+02 2.084e+02 2.360e+02 3.426e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 16:34:22,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 16:34:24,523 WARNING [train.py:1197] (0/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:24,608 WARNING [train.py:1197] (0/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:27,625 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:35,169 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 16:34:35,535 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=771226.6666666666, ans=0.125 2023-09-30 16:34:38,378 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:39,860 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:34:48,835 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:34:48,858 WARNING [train.py:1197] (0/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:53,296 WARNING [train.py:1197] (0/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:53,538 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:34:54,882 INFO [train.py:1039] (0/4) Epoch 22, batch 4150, loss[loss=0.1748, simple_loss=0.2644, pruned_loss=0.04263, over 24095.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2505, pruned_loss=0.04942, over 4700708.09 frames. ], batch size: 80, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:34:55,464 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:34:56,730 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:58,609 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 16:34:58,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:34:58,739 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:02,376 WARNING [train.py:1197] (0/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 16:35:02,432 WARNING [train.py:1197] (0/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:02,701 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=771360.0, ans=0.125 2023-09-30 16:35:03,863 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 16:35:03,973 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 16:35:04,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 16:35:06,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:10,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:35:10,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:15,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:35:17,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:18,962 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:35:19,185 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:35:19,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:20,791 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:35:25,363 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:28,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:30,048 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 16:35:32,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 16:35:32,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:35:35,352 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 16:35:35,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. 
Duration: 0.87275 2023-09-30 16:35:35,393 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:38,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:39,940 WARNING [train.py:1197] (0/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:43,158 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 16:35:47,071 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:35:48,682 WARNING [train.py:1197] (0/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:35:48,797 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 16:35:50,749 WARNING [train.py:1197] (0/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:52,302 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 16:35:52,747 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=771560.0, ans=0.125 2023-09-30 16:35:53,782 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:35:55,282 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:35:56,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:58,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 16:35:58,402 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:58,406 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:36:00,046 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:36:02,060 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=11.98 vs. limit=15.0 2023-09-30 16:36:03,030 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 16:36:04,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:04,351 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:36:04,384 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:36:04,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 16:36:04,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:36:04,592 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:36:06,580 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:36:08,190 WARNING [train.py:1197] (0/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:08,223 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 16:36:09,600 WARNING [train.py:1197] (0/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:36:15,663 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:36:15,885 WARNING [train.py:1197] (0/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 16:36:17,303 INFO [train.py:1039] (0/4) Epoch 22, batch 4200, loss[loss=0.1422, simple_loss=0.2245, pruned_loss=0.02997, over 24308.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2493, pruned_loss=0.04924, over 4685881.05 frames. ], batch size: 61, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:36:18,725 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.92 vs. limit=15.0 2023-09-30 16:36:19,496 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:36:21,631 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.01 vs. 
limit=12.0 2023-09-30 16:36:23,143 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:23,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:36:23,579 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=771693.3333333334, ans=0.0 2023-09-30 16:36:24,754 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:24,757 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:27,767 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 16:36:30,913 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 16:36:30,984 WARNING [train.py:1197] (0/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:33,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:34,404 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=771760.0, ans=0.04949747468305833 2023-09-30 16:36:35,713 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771760.0, ans=0.1 2023-09-30 16:36:36,923 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. 
Duration: 0.9454375 2023-09-30 16:36:40,479 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:36:40,693 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:36:40,732 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:42,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 16:36:42,844 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:44,407 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:45,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:45,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:36:47,510 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:36:47,808 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=771760.0, ans=0.125 2023-09-30 16:36:50,539 WARNING [train.py:1197] (0/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 16:36:51,932 WARNING [train.py:1197] (0/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:36:53,752 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=771826.6666666666, ans=0.0 2023-09-30 16:36:57,207 WARNING [train.py:1197] (0/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:36:57,350 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:37:00,389 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:37:00,567 WARNING [train.py:1197] (0/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:00,909 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=771826.6666666666, ans=0.1 2023-09-30 16:37:03,613 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:37:03,637 WARNING [train.py:1197] (0/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 16:37:03,672 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:05,229 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. 
Duration: 0.7545625 2023-09-30 16:37:06,545 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.822e+02 2.029e+02 2.265e+02 3.199e+02, threshold=4.057e+02, percent-clipped=0.0 2023-09-30 16:37:11,381 WARNING [train.py:1197] (0/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:37:13,548 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:20,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:37:23,345 WARNING [train.py:1197] (0/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 16:37:24,952 WARNING [train.py:1197] (0/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:30,247 WARNING [train.py:1197] (0/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:37:30,357 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:34,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 16:37:37,778 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:37:40,682 INFO [train.py:1039] (0/4) Epoch 22, batch 4250, loss[loss=0.1546, simple_loss=0.2337, pruned_loss=0.03773, over 24314.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2482, pruned_loss=0.04869, over 4688603.78 frames. 
], batch size: 61, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:37:42,903 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:43,150 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=772026.6666666666, ans=0.125 2023-09-30 16:37:44,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:37:47,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:47,760 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=772026.6666666666, ans=0.0 2023-09-30 16:37:50,583 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:37:50,652 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 16:37:52,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:55,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:57,761 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=772093.3333333334, ans=0.125 2023-09-30 16:37:58,924 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:38:02,730 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:03,900 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:06,270 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:38:06,274 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:07,855 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:09,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:10,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:12,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:38:14,032 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:14,293 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772160.0, ans=0.1 2023-09-30 16:38:16,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. 
Duration: 0.98 2023-09-30 16:38:19,374 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=772160.0, ans=0.1 2023-09-30 16:38:20,471 WARNING [train.py:1197] (0/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 16:38:20,483 WARNING [train.py:1197] (0/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:20,589 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:38:20,623 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:22,125 WARNING [train.py:1197] (0/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:38:22,130 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:22,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:26,785 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 16:38:28,893 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:38:32,209 WARNING [train.py:1197] (0/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:33,683 WARNING [train.py:1197] (0/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:38:35,128 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 16:38:35,140 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:38:37,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 16:38:38,891 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:38:41,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:38:44,211 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:44,263 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:45,937 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 16:38:47,509 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:38:47,601 WARNING [train.py:1197] (0/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:38:52,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:55,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:38:55,985 WARNING [train.py:1197] (0/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:38:56,316 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=772293.3333333334, ans=0.5 2023-09-30 16:38:57,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:59,098 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:00,051 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.49 vs. limit=6.0 2023-09-30 16:39:01,367 WARNING [train.py:1197] (0/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:39:01,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:01,468 WARNING [train.py:1197] (0/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 16:39:03,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:04,427 INFO [train.py:1039] (0/4) Epoch 22, batch 4300, loss[loss=0.171, simple_loss=0.2493, pruned_loss=0.04639, over 24428.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2475, pruned_loss=0.04827, over 4691770.19 frames. ], batch size: 63, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:39:05,355 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.06 vs. 
limit=15.0 2023-09-30 16:39:08,120 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.79 vs. limit=15.0 2023-09-30 16:39:09,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:11,162 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:14,310 WARNING [train.py:1197] (0/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:23,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:39:23,438 WARNING [train.py:1197] (0/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 16:39:24,973 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:39:26,579 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:39:26,609 WARNING [train.py:1197] (0/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:39:26,630 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 16:39:31,662 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:39:33,276 WARNING [train.py:1197] (0/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 16:39:38,437 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 16:39:38,458 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:39:38,493 WARNING [train.py:1197] (0/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 16:39:41,620 WARNING [train.py:1197] (0/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:39:41,817 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:39:43,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:39:43,513 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:43,748 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=772493.3333333334, ans=0.125 2023-09-30 16:39:45,061 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:39:48,575 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:48,736 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:49,389 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.27 vs. 
limit=15.0 2023-09-30 16:39:50,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 16:39:50,202 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 16:39:51,837 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:39:52,120 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=772560.0, ans=0.125 2023-09-30 16:39:53,166 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.888e+02 2.133e+02 2.440e+02 3.863e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 16:39:54,225 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.29 vs. limit=6.0 2023-09-30 16:39:54,976 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:54,989 WARNING [train.py:1197] (0/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:39:55,012 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:55,075 WARNING [train.py:1197] (0/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:55,093 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 16:39:55,095 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. 
Duration: 0.95 2023-09-30 16:39:56,667 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 16:39:58,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:39:58,205 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 16:39:58,255 WARNING [train.py:1197] (0/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 16:40:02,812 WARNING [train.py:1197] (0/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:05,195 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 16:40:05,276 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:40:08,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:08,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:10,548 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=772626.6666666666, ans=0.0 2023-09-30 16:40:11,823 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 16:40:11,930 WARNING [train.py:1197] (0/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. 
Duration: 0.8666875 2023-09-30 16:40:11,936 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:13,333 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:13,394 WARNING [train.py:1197] (0/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:13,480 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:40:16,547 WARNING [train.py:1197] (0/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:40:18,215 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:19,002 WARNING [train.py:1197] (0/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:20,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:25,319 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 16:40:26,609 INFO [train.py:1039] (0/4) Epoch 22, batch 4350, loss[loss=0.1748, simple_loss=0.246, pruned_loss=0.05177, over 23643.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2483, pruned_loss=0.04848, over 4700990.45 frames. ], batch size: 149, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:40:26,710 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. 
Duration: 0.87775 2023-09-30 16:40:31,266 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:40:34,327 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:36,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:40:36,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:40:41,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:40:44,855 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:46,540 WARNING [train.py:1197] (0/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:40:46,564 WARNING [train.py:1197] (0/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:49,660 WARNING [train.py:1197] (0/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:40:53,197 WARNING [train.py:1197] (0/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:40:55,429 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:41:00,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. 
Duration: 0.58 2023-09-30 16:41:01,773 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:01,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:07,950 WARNING [train.py:1197] (0/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:11,430 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 16:41:13,686 WARNING [train.py:1197] (0/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:15,221 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:41:19,794 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 16:41:21,328 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:21,409 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:41:22,931 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 16:41:23,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=772893.3333333334, ans=0.02 2023-09-30 16:41:24,402 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0021-538392 from training. 
Duration: 0.768 2023-09-30 16:41:24,423 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:24,467 WARNING [train.py:1197] (0/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:24,579 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:41:26,584 WARNING [train.py:1197] (0/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:27,883 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:27,965 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:41:31,096 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 16:41:31,116 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:31,119 WARNING [train.py:1197] (0/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:32,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:32,656 WARNING [train.py:1197] (0/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 16:41:35,531 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0012-555972 from training. 
Duration: 0.896 2023-09-30 16:41:35,538 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 16:41:35,554 WARNING [train.py:1197] (0/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 16:41:37,453 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=772960.0, ans=0.1 2023-09-30 16:41:38,758 WARNING [train.py:1197] (0/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:41:38,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:41:38,823 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:41:39,212 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=772960.0, ans=0.0 2023-09-30 16:41:40,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:41:41,904 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 16:41:44,894 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 16:41:44,906 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:49,151 INFO [train.py:1039] (0/4) Epoch 22, batch 4400, loss[loss=0.1803, simple_loss=0.2481, pruned_loss=0.05621, over 23677.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2496, pruned_loss=0.04921, over 4690296.70 frames. 
], batch size: 164, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:41:49,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:41:49,369 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:53,713 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:55,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 16:41:55,340 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 16:41:55,630 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=773026.6666666666, ans=0.0 2023-09-30 16:41:56,821 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 16:41:56,876 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 16:41:58,368 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:41:58,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:42:01,309 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. 
Duration: 0.87 2023-09-30 16:42:03,879 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=773093.3333333334, ans=0.1 2023-09-30 16:42:05,645 WARNING [train.py:1197] (0/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:07,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:07,132 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 16:42:08,940 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:08,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 16:42:09,029 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 16:42:12,262 WARNING [train.py:1197] (0/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 16:42:13,720 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 16:42:13,762 WARNING [train.py:1197] (0/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 16:42:13,819 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:15,353 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:42:15,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:18,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:18,439 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 16:42:18,451 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 16:42:19,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:20,280 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=773160.0, ans=0.125 2023-09-30 16:42:22,155 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:42:22,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:23,727 WARNING [train.py:1197] (0/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:25,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:25,261 WARNING [train.py:1197] (0/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 16:42:25,410 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0190-640317 from training. 
Duration: 0.768 2023-09-30 16:42:28,926 WARNING [train.py:1197] (0/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:37,032 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:38,239 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.760e+02 1.970e+02 2.320e+02 3.797e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 16:42:39,890 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 16:42:44,450 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:42:46,112 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:42:47,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:42:49,228 WARNING [train.py:1197] (0/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 16:42:49,273 WARNING [train.py:1197] (0/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:42:49,291 WARNING [train.py:1197] (0/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:42:49,295 WARNING [train.py:1197] (0/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:42:50,868 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. 
Duration: 0.6 2023-09-30 16:42:55,443 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 16:42:58,887 WARNING [train.py:1197] (0/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 16:43:00,991 WARNING [train.py:1197] (0/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 16:43:01,016 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:01,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 16:43:02,866 INFO [checkpoint.py:75] (0/4) Saving checkpoint to zipformer/exp-w-tal-csasr/checkpoint-116000.pt 2023-09-30 16:43:06,016 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:43:07,420 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=773293.3333333334, ans=0.125 2023-09-30 16:43:08,864 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:43:11,806 WARNING [train.py:1197] (0/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 16:43:13,082 INFO [train.py:1039] (0/4) Epoch 22, batch 4450, loss[loss=0.1414, simple_loss=0.2188, pruned_loss=0.03202, over 17319.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2508, pruned_loss=0.04965, over 4687986.19 frames. 
], batch size: 37, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:43:13,431 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=773360.0, ans=0.0 2023-09-30 16:43:16,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:43:17,179 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=773360.0, ans=0.2 2023-09-30 16:43:18,639 WARNING [train.py:1197] (0/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:20,178 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:43:22,690 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.93 vs. limit=22.5 2023-09-30 16:43:24,998 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:43:25,041 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:43:28,147 WARNING [train.py:1197] (0/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:30,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:43:33,192 WARNING [train.py:1197] (0/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. 
Duration: 0.8181875 2023-09-30 16:43:33,235 WARNING [train.py:1197] (0/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:33,530 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=773426.6666666666, ans=0.0 2023-09-30 16:43:36,735 WARNING [train.py:1197] (0/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 16:43:36,737 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:36,859 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:36,917 WARNING [train.py:1197] (0/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:43:36,918 WARNING [train.py:1197] (0/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:43:39,943 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:43:47,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:47,489 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:48,939 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:51,085 WARNING [train.py:1197] (0/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:43:51,233 WARNING [train.py:1197] (0/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:43:56,478 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:43:58,035 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 16:43:58,061 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 16:43:58,069 WARNING [train.py:1197] (0/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:44:00,979 WARNING [train.py:1197] (0/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:02,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 16:44:06,972 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:44:10,567 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:10,661 WARNING [train.py:1197] (0/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 16:44:10,698 WARNING [train.py:1197] (0/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:10,703 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:44:12,131 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:44:12,142 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:13,702 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:13,931 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=773560.0, ans=0.0 2023-09-30 16:44:16,788 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:44:16,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 16:44:18,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:44:19,992 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:44:21,494 WARNING [train.py:1197] (0/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:23,070 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:23,757 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:44:26,525 WARNING [train.py:1197] (0/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. 
Duration: 0.711125 2023-09-30 16:44:30,216 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 16:44:30,676 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=773626.6666666666, ans=0.1 2023-09-30 16:44:31,891 WARNING [train.py:1197] (0/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:44:34,905 INFO [train.py:1039] (0/4) Epoch 22, batch 4500, loss[loss=0.1761, simple_loss=0.2461, pruned_loss=0.05311, over 24611.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2509, pruned_loss=0.04936, over 4691258.54 frames. ], batch size: 60, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:44:36,648 WARNING [train.py:1197] (0/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:36,807 WARNING [train.py:1197] (0/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 16:44:36,809 WARNING [train.py:1197] (0/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 16:44:36,981 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=773693.3333333334, ans=0.1 2023-09-30 16:44:39,871 WARNING [train.py:1197] (0/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:44:45,332 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 16:44:45,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=773693.3333333334, ans=0.2 2023-09-30 16:44:46,669 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:46,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:44:48,240 WARNING [train.py:1197] (0/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:44:49,667 WARNING [train.py:1197] (0/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:49,748 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:50,079 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=773760.0, ans=0.125 2023-09-30 16:44:56,169 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=773760.0, ans=0.1 2023-09-30 16:45:04,440 WARNING [train.py:1197] (0/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:45:05,942 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:45:09,153 WARNING [train.py:1197] (0/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:45:09,239 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:45:10,758 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:45:14,146 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=773826.6666666666, ans=0.125 2023-09-30 16:45:15,595 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:45:22,709 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:45:24,239 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.959e+02 2.152e+02 2.408e+02 4.470e+02, threshold=4.304e+02, percent-clipped=1.0 2023-09-30 16:45:26,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:45:29,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:45:29,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 16:45:30,574 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:30,647 WARNING [train.py:1197] (0/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:45:32,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:32,345 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:34,797 WARNING [train.py:1197] (0/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:45:34,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 16:45:34,833 WARNING [train.py:1197] (0/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:45:34,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:39,970 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:45:40,013 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:45:43,214 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:47,551 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:45:47,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:45:49,102 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. 
Duration: 0.95 2023-09-30 16:45:51,520 WARNING [train.py:1197] (0/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 16:45:51,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 16:45:55,137 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 16:45:58,082 INFO [train.py:1039] (0/4) Epoch 22, batch 4550, loss[loss=0.1756, simple_loss=0.2603, pruned_loss=0.0455, over 23663.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2494, pruned_loss=0.04872, over 4694031.39 frames. ], batch size: 85, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:45:58,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 16:45:59,678 WARNING [train.py:1197] (0/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:00,306 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.96 vs. limit=22.5 2023-09-30 16:46:02,769 WARNING [train.py:1197] (0/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:04,163 WARNING [train.py:1197] (0/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:07,803 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:14,503 WARNING [train.py:1197] (0/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:46:16,127 WARNING [train.py:1197] (0/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:46:19,109 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:19,113 WARNING [train.py:1197] (0/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:46:19,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:20,770 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:20,832 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:24,079 WARNING [train.py:1197] (0/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:46:28,193 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 16:46:28,302 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 16:46:28,417 WARNING [train.py:1197] (0/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:46:29,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. 
Duration: 0.73 2023-09-30 16:46:31,742 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=774160.0, ans=0.125 2023-09-30 16:46:34,402 WARNING [train.py:1197] (0/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 16:46:34,511 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:37,565 WARNING [train.py:1197] (0/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 16:46:39,201 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:46:41,545 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,590 WARNING [train.py:1197] (0/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,611 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:46:44,788 WARNING [train.py:1197] (0/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 16:46:48,331 WARNING [train.py:1197] (0/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:46:51,248 WARNING [train.py:1197] (0/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:51,271 WARNING [train.py:1197] (0/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:46:52,764 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:54,750 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 16:46:54,852 WARNING [train.py:1197] (0/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 16:46:54,892 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:46:56,422 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 16:46:56,682 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 16:46:58,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:47:00,381 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:00,409 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:01,856 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:01,884 WARNING [train.py:1197] (0/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:47:03,340 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. 
Duration: 0.5181875 2023-09-30 16:47:04,784 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 16:47:04,978 WARNING [train.py:1197] (0/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:47:04,993 WARNING [train.py:1197] (0/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:47:06,541 WARNING [train.py:1197] (0/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 16:47:06,552 WARNING [train.py:1197] (0/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:47:06,577 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 16:47:11,189 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:47:11,209 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:47:13,306 WARNING [train.py:1197] (0/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:47:14,846 WARNING [train.py:1197] (0/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:14,897 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:47:16,494 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:47:18,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:47:18,375 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=774293.3333333334, ans=0.125 2023-09-30 16:47:21,101 INFO [train.py:1039] (0/4) Epoch 22, batch 4600, loss[loss=0.1768, simple_loss=0.2459, pruned_loss=0.05381, over 23309.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2486, pruned_loss=0.04774, over 4718281.30 frames. ], batch size: 119, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:47:21,187 WARNING [train.py:1197] (0/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:21,318 WARNING [train.py:1197] (0/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:24,952 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:47:24,974 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:47:25,080 WARNING [train.py:1197] (0/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:27,217 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 16:47:30,068 WARNING [train.py:1197] (0/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:47:30,741 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.71 vs. 
limit=22.5 2023-09-30 16:47:32,568 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=774360.0, ans=0.1 2023-09-30 16:47:35,280 WARNING [train.py:1197] (0/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:47:36,854 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:41,297 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:48,426 WARNING [train.py:1197] (0/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 16:47:49,921 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:51,757 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=774426.6666666666, ans=0.125 2023-09-30 16:47:54,400 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:56,276 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=774493.3333333334, ans=0.0 2023-09-30 16:47:57,436 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:47:57,449 WARNING [train.py:1197] (0/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:02,355 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. 
Duration: 0.64 2023-09-30 16:48:02,356 WARNING [train.py:1197] (0/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:48:04,268 WARNING [train.py:1197] (0/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:08,114 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=774493.3333333334, ans=0.125 2023-09-30 16:48:11,414 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.811e+02 1.994e+02 2.205e+02 2.930e+02, threshold=3.988e+02, percent-clipped=0.0 2023-09-30 16:48:11,527 WARNING [train.py:1197] (0/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:11,611 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:48:13,103 WARNING [train.py:1197] (0/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:48:16,399 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 16:48:19,267 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:48:19,577 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=774560.0, ans=0.0 2023-09-30 16:48:24,570 WARNING [train.py:1197] (0/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. 
Duration: 0.9090625 2023-09-30 16:48:26,114 WARNING [train.py:1197] (0/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:48:29,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:29,179 WARNING [train.py:1197] (0/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 16:48:29,232 WARNING [train.py:1197] (0/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:29,334 WARNING [train.py:1197] (0/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 16:48:29,612 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=774626.6666666666, ans=0.125 2023-09-30 16:48:30,733 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:30,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:30,984 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:31,608 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.76 vs. limit=10.0 2023-09-30 16:48:32,462 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:48:34,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:34,104 WARNING [train.py:1197] (0/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 16:48:34,172 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 16:48:35,938 WARNING [train.py:1197] (0/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 16:48:35,948 WARNING [train.py:1197] (0/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:37,534 WARNING [train.py:1197] (0/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:37,623 WARNING [train.py:1197] (0/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:39,573 WARNING [train.py:1197] (0/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:44,930 INFO [train.py:1039] (0/4) Epoch 22, batch 4650, loss[loss=0.1763, simple_loss=0.2535, pruned_loss=0.04956, over 23379.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2485, pruned_loss=0.04769, over 4723560.85 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:48:49,815 WARNING [train.py:1197] (0/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:48:51,454 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. 
Duration: 0.936375 2023-09-30 16:48:52,925 WARNING [train.py:1197] (0/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:53,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:48:53,058 WARNING [train.py:1197] (0/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:53,095 WARNING [train.py:1197] (0/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:56,400 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:59,475 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 16:49:04,026 WARNING [train.py:1197] (0/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:49:06,895 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 16:49:06,929 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:49:08,462 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 16:49:08,497 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:49:08,583 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. 
Duration: 0.99 2023-09-30 16:49:08,622 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 16:49:08,634 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:10,018 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:49:13,546 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:49:14,999 WARNING [train.py:1197] (0/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:15,701 WARNING [train.py:1197] (0/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 16:49:18,717 WARNING [train.py:1197] (0/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:22,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 16:49:23,791 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:23,805 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:49:25,321 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 16:49:26,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:49:29,899 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:49:33,514 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:49:38,227 WARNING [train.py:1197] (0/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:40,188 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=774893.3333333334, ans=0.0 2023-09-30 16:49:41,382 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:42,742 WARNING [train.py:1197] (0/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:42,813 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:49:45,919 WARNING [train.py:1197] (0/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 16:49:45,982 WARNING [train.py:1197] (0/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 16:49:46,086 WARNING [train.py:1197] (0/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 16:49:46,089 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 16:49:48,309 WARNING [train.py:1197] (0/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:49:56,960 WARNING [train.py:1197] (0/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:49:56,967 WARNING [train.py:1197] (0/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:49:58,372 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 16:49:58,412 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:49:59,987 WARNING [train.py:1197] (0/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:00,007 WARNING [train.py:1197] (0/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:50:01,655 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:50:04,796 WARNING [train.py:1197] (0/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:50:04,816 WARNING [train.py:1197] (0/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:06,783 INFO [train.py:1039] (0/4) Epoch 22, batch 4700, loss[loss=0.1873, simple_loss=0.2538, pruned_loss=0.06039, over 23753.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2489, pruned_loss=0.04817, over 4727817.45 frames. ], batch size: 212, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:50:06,896 WARNING [train.py:1197] (0/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:50:09,111 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.13 vs. limit=10.0 2023-09-30 16:50:10,171 WARNING [train.py:1197] (0/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:10,237 WARNING [train.py:1197] (0/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:50:10,251 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:50:11,864 WARNING [train.py:1197] (0/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 16:50:12,005 WARNING [train.py:1197] (0/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:50:13,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 16:50:16,897 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=775026.6666666666, ans=0.125 2023-09-30 16:50:21,969 WARNING [train.py:1197] (0/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:23,460 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:23,528 WARNING [train.py:1197] (0/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:50:25,019 WARNING [train.py:1197] (0/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:25,271 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:50:26,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:50:26,906 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=775093.3333333334, ans=0.2 2023-09-30 16:50:31,951 WARNING [train.py:1197] (0/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 16:50:32,001 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 16:50:33,825 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=775093.3333333334, ans=0.125 2023-09-30 16:50:35,159 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:36,783 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:50:36,826 WARNING [train.py:1197] (0/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:50:42,003 WARNING [train.py:1197] (0/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. 
Duration: 0.9181875 2023-09-30 16:50:47,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=775160.0, ans=0.0 2023-09-30 16:50:48,250 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:50:49,860 WARNING [train.py:1197] (0/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:50:51,361 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:55,718 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.872e+02 2.138e+02 2.585e+02 4.153e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 16:50:56,105 INFO [scaling.py:1118] (0/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:50:58,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 16:50:58,238 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:51:01,888 WARNING [train.py:1197] (0/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:06,754 WARNING [train.py:1197] (0/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 16:51:08,418 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:51:11,626 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:51:13,091 WARNING [train.py:1197] (0/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 16:51:14,708 WARNING [train.py:1197] (0/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:14,733 WARNING [train.py:1197] (0/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:18,618 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=775293.3333333334, ans=0.125 2023-09-30 16:51:19,947 WARNING [train.py:1197] (0/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:51:20,017 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:51:20,056 WARNING [train.py:1197] (0/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 16:51:21,623 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 16:51:23,294 WARNING [train.py:1197] (0/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:26,264 WARNING [train.py:1197] (0/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,265 WARNING [train.py:1197] (0/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. 
Duration: 0.99 2023-09-30 16:51:26,418 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:29,390 INFO [train.py:1039] (0/4) Epoch 22, batch 4750, loss[loss=0.1691, simple_loss=0.2479, pruned_loss=0.04509, over 23510.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2499, pruned_loss=0.04859, over 4705082.69 frames. ], batch size: 106, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:51:29,921 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=775360.0, ans=0.0 2023-09-30 16:51:31,120 WARNING [train.py:1197] (0/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 16:51:34,886 WARNING [train.py:1197] (0/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:51:35,202 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=775360.0, ans=0.2 2023-09-30 16:51:36,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:37,643 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=775360.0, ans=0.1 2023-09-30 16:51:40,507 WARNING [train.py:1197] (0/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,543 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:51:43,015 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. 
Duration: 0.66 2023-09-30 16:51:43,073 WARNING [train.py:1197] (0/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:51:44,791 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=775360.0, ans=0.1 2023-09-30 16:51:47,502 WARNING [train.py:1197] (0/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 16:51:47,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:51:47,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:49,170 WARNING [train.py:1197] (0/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:51:56,077 WARNING [train.py:1197] (0/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 16:51:59,388 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:52:02,377 WARNING [train.py:1197] (0/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 16:52:02,496 WARNING [train.py:1197] (0/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:03,059 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.04 vs. limit=10.0 2023-09-30 16:52:05,596 WARNING [train.py:1197] (0/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. 
Duration: 0.9545625 2023-09-30 16:52:05,600 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:05,630 WARNING [train.py:1197] (0/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:09,009 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 16:52:09,014 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 16:52:12,843 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 16:52:15,256 WARNING [train.py:1197] (0/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:18,191 WARNING [train.py:1197] (0/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:18,660 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=775493.3333333334, ans=0.1 2023-09-30 16:52:19,824 WARNING [train.py:1197] (0/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:52:19,825 WARNING [train.py:1197] (0/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 16:52:19,832 WARNING [train.py:1197] (0/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:23,014 WARNING [train.py:1197] (0/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. 
Duration: 0.863625 2023-09-30 16:52:26,924 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=775560.0, ans=0.125 2023-09-30 16:52:28,203 WARNING [train.py:1197] (0/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:52:31,241 WARNING [train.py:1197] (0/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 16:52:31,301 WARNING [train.py:1197] (0/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 16:52:32,783 WARNING [train.py:1197] (0/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:32,833 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:52:34,397 WARNING [train.py:1197] (0/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:34,516 WARNING [train.py:1197] (0/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:52:34,546 WARNING [train.py:1197] (0/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 16:52:37,561 WARNING [train.py:1197] (0/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 16:52:40,772 WARNING [train.py:1197] (0/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:52:42,471 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. 
Duration: 0.963625 2023-09-30 16:52:42,474 WARNING [train.py:1197] (0/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 16:52:42,536 WARNING [train.py:1197] (0/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:44,587 WARNING [train.py:1197] (0/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:46,174 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:52:46,272 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:47,740 WARNING [train.py:1197] (0/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:52:51,509 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:52:52,848 WARNING [train.py:1197] (0/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 16:52:52,980 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 16:52:53,208 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=775693.3333333334, ans=0.09899494936611666 2023-09-30 16:52:54,285 INFO [train.py:1039] (0/4) Epoch 22, batch 4800, loss[loss=0.196, simple_loss=0.2606, pruned_loss=0.06566, over 23506.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2511, pruned_loss=0.04898, over 4712735.55 frames. 
], batch size: 285, lr: 4.63e-03, grad_scale: 32.0 2023-09-30 16:52:54,517 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 16:52:57,688 WARNING [train.py:1197] (0/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:52:59,114 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:00,714 WARNING [train.py:1197] (0/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 16:53:06,000 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:06,070 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:10,878 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:53:12,434 WARNING [train.py:1197] (0/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:12,477 WARNING [train.py:1197] (0/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:12,568 WARNING [train.py:1197] (0/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 16:53:14,066 WARNING [train.py:1197] (0/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. 
Duration: 0.9818125 2023-09-30 16:53:14,136 WARNING [train.py:1197] (0/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:53:16,517 WARNING [train.py:1197] (0/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:53:23,597 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:25,106 WARNING [train.py:1197] (0/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:25,167 WARNING [train.py:1197] (0/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:53:26,753 WARNING [train.py:1197] (0/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:26,781 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:53:26,802 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:28,404 WARNING [train.py:1197] (0/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:30,149 WARNING [train.py:1197] (0/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:33,072 WARNING [train.py:1197] (0/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. 
Duration: 0.9909375 2023-09-30 16:53:33,372 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=775826.6666666666, ans=0.125 2023-09-30 16:53:36,881 WARNING [train.py:1197] (0/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:36,914 WARNING [train.py:1197] (0/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:53:37,160 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=775826.6666666666, ans=0.0 2023-09-30 16:53:38,445 WARNING [train.py:1197] (0/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:53:41,348 WARNING [train.py:1197] (0/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:41,562 WARNING [train.py:1197] (0/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 16:53:43,056 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 16:53:43,192 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:43,225 WARNING [train.py:1197] (0/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:53:44,493 INFO [optim.py:468] (0/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.908e+02 2.098e+02 2.406e+02 3.815e+02, threshold=4.197e+02, percent-clipped=0.0 2023-09-30 16:53:44,697 WARNING [train.py:1197] (0/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. 
Duration: 0.72725 2023-09-30 16:53:44,707 WARNING [train.py:1197] (0/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:44,722 WARNING [train.py:1197] (0/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:53:46,298 WARNING [train.py:1197] (0/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:53:46,388 WARNING [train.py:1197] (0/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:47,110 INFO [scaling.py:1022] (0/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.09 vs. limit=10.0 2023-09-30 16:53:51,123 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:54,853 WARNING [train.py:1197] (0/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:55,053 WARNING [train.py:1197] (0/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:53:57,805 INFO [scaling.py:213] (0/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=775893.3333333334, ans=0.125